Random selection of people (how to with survey package)

Hi all, (Or Hi IPUMS Staff :D)

I’m stuck on something that I’m wondering if anyone has any conceptual and/or applied feedback on.

To note: I use R to analyze ACS extracts with the survey package. Given the structure of ACS data - using either PERWT or REPWTP - is it possible to take random selections of the population?

For purely illustrative purposes, let’s say I have an ACS extract of respondents with STATEICP and INCTOT. Instead of just aggregating the income by state (mean, median) etc, what if I wanted to take a random selection of people per state - and then show summary stats and/or do an analysis on them? This is purely illustrative - but I do need to find a way to take the sample provided by an ACS extract, and then randomly select X% of them for an analysis.

The trick here is how to randomly select a portion of your sample (as a subset) to run a new analysis on. The problem I can’t seem to wrap my head around is that, given the nature of the data - where 1 row may represent 3 or 80 people etc. - how can you take a random selection?

My intuition is to create a binary variable in the data object (before converting it to a survey object) called “randomlySelected”, with some pre-defined probability that it is 1 rather than 0. Then, once I create a survey object, I can subset to only the rows where randomlySelected == 1. The problem is this: say I want to randomly select 50% of the sample. The variable may flag roughly 50% of the rows in the data object, but once I convert it to a survey object, those rows may not represent 50% of the weighted sample.
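A minimal sketch of that idea in base R, to make the concern concrete. The data frame here is made up (random weights and incomes standing in for a real extract), and the seed is arbitrary; only the variable names PERWT, INCTOT, and randomlySelected come from the description above.

```r
set.seed(1)  # arbitrary seed, for reproducibility only

# Toy stand-in for an extract: made-up weights and incomes, not real ACS values
acs <- data.frame(
  PERWT  = sample(3:80, 1000, replace = TRUE),
  INCTOT = round(rlnorm(1000, meanlog = 10, sdlog = 1))
)

# Flag each row independently with probability 0.5
acs$randomlySelected <- rbinom(nrow(acs), size = 1, prob = 0.5)

# Share of rows flagged vs. share of the weighted population flagged
row_share      <- mean(acs$randomlySelected)
weighted_share <- sum(acs$PERWT * acs$randomlySelected) / sum(acs$PERWT)
```

The two shares will generally both land near 0.5 but will not match each other, because each flagged row carries a different weight. With a real design object from the survey package, the equivalent subset would be `subset(design, randomlySelected == 1)`.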

Any insight would be super appreciated!

It is possible to take random selections of the population using the Customize Sample Size feature during data extract creation, which automatically re-adjusts the weights to properly represent the population for a smaller sample size. Note that, in the ACS, individuals are clustered at the household level; this feature randomly selects households, with all members, which is the proper way to select a subset of cases in the ACS.
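For anyone who wants to do something similar on an extract they already have, here is a rough base R sketch of household-level selection. It is not the extract system's actual procedure: the data are made up, it assumes a single-sample extract where SERIAL identifies households, and the weight adjustment (scaling so the subsample's weights sum to the original weighted total) is just one simple choice.

```r
set.seed(42)  # arbitrary seed, for reproducibility only

# Toy stand-in for a single-sample extract; all values are made up
acs <- data.frame(
  SERIAL = rep(1:400, each = 3),   # household ID (3 people per household here)
  PERWT  = sample(3:80, 1200, replace = TRUE),
  INCTOT = round(rlnorm(1200, meanlog = 10, sdlog = 1))
)

frac <- 0.5  # target share of households to keep

# Draw households, not person rows, so every member comes along
households <- unique(acs$SERIAL)
keep_hh    <- sample(households, size = round(frac * length(households)))
sub        <- acs[acs$SERIAL %in% keep_hh, ]

# Assumed adjustment: rescale weights so the subsample still sums
# to the full weighted total
sub$PERWT_ADJ <- sub$PERWT * sum(acs$PERWT) / sum(sub$PERWT)
```

The subsample can then be passed to `svydesign()` with `weights = ~PERWT_ADJ`. If you use replicate weights (REPWTP) for variance estimation, they would presumably need the same rescaling, which this sketch does not cover.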


First off, thank you!