How to create a household-level "tally" variable based on a population variable

I want to create (in the SDA) a household-level “tally” variable based on employment status, i.e., the number of employed workers (EMPSTAT=1) in each sample household. This variable would take the values 0, 1, 2, 3, etc., up to the maximum household size.

In SAS I would create a dummy variable: if empstat=1 then employed1=1; else employed1=0;
Then I would proc sort the data by household ID and run proc summary by household ID to sum the dummy.
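
For readers who do not use SAS, the dummy-then-sum logic described above can be sketched in plain Python. This is only an illustration under stated assumptions: the household IDs and records are made up, and the only code taken from the question is EMPSTAT=1 for employed. The function name tally_workers is hypothetical.

```python
from collections import defaultdict

# Hypothetical person-level records: (household ID, EMPSTAT).
# EMPSTAT == 1 means employed, as in the question; other codes and IDs are invented.
persons = [
    ("H1", 1), ("H1", 1), ("H1", 3),
    ("H2", 3), ("H2", 0),
    ("H3", 1),
]

def tally_workers(records):
    """Return {household_id: number of persons with EMPSTAT == 1}."""
    workers = defaultdict(int)
    for hh_id, empstat in records:
        # Adding 0 for non-workers keeps zero-worker households in the result.
        workers[hh_id] += 1 if empstat == 1 else 0
    return dict(workers)

workers_by_hh = tally_workers(persons)
print(workers_by_hh)  # {'H1': 2, 'H2': 0, 'H3': 1}
```

Adding the 0/1 dummy for every person, rather than filtering to workers first, is what keeps households with no workers in the output; those are the households the later analysis is about.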

I’m not sure how or if this can be done in the SDA!

Ultimately I would be producing tables on households by workers in households by vehicles available in household, etc.

Thanks in advance! Chuck Purvis, Hayward, CA

Unfortunately, this type of analysis cannot be done using the SDA system. It has no function for calculating sums of a person-level variable within households. If you have any issues conducting this analysis in SAS, please let us know!

Thanks, Ivan

Rats, I was hoping I was just overlooking something.

For that kind of PUMS analysis, I’ll probably use the R package tidycensus to pull the PUMS data. tidycensus can handle the replicate weights. I can then use functions in the R package dplyr to “group_by” household ID and then summarize my person-level tally variable. I’m really trying to wean myself off of SAS.

Part of my research is on “who are households-without-workers”? Are they retired households, or households with disabled folks not in the labor force? Or what?

Other person-level variables that I’m thinking of tallying to the household level are voting-age persons in the household (18+) and driving-age persons in the household (16+). The purpose of the driving-age population (DAP) tally is to get a cross-tab of households by number of driving-age persons by vehicles available. That helps in understanding “vehicle sufficiency” within households: what share of households have fewer drivers than vehicles?
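
The age-based tallies and the cross-tab described above could look roughly like this in plain Python. It is a minimal, unweighted sketch: the records, the vehicles lookup, and the helper name count_persons_at_least are all hypothetical, and real PUMS estimates would need household weights applied.

```python
from collections import Counter, defaultdict

# Hypothetical person records: (household ID, age).
persons = [("H1", 45), ("H1", 43), ("H1", 15),
           ("H2", 70),
           ("H3", 30), ("H3", 17)]
# Hypothetical household records: household ID -> vehicles available.
vehicles = {"H1": 1, "H2": 1, "H3": 3}

def count_persons_at_least(records, min_age):
    """Return {household_id: number of persons aged min_age or older}."""
    counts = defaultdict(int)
    for hh_id, age in records:
        counts[hh_id] += 1 if age >= min_age else 0
    return dict(counts)

dap = count_persons_at_least(persons, 16)  # driving-age persons (16+)
vap = count_persons_at_least(persons, 18)  # voting-age persons (18+)

# Unweighted cross-tab: (driving-age persons, vehicles) -> number of households.
crosstab = Counter((dap.get(hh, 0), veh) for hh, veh in vehicles.items())

# Unweighted share of households with fewer driving-age persons than vehicles.
share_fewer_drivers = sum(
    1 for hh, veh in vehicles.items() if dap.get(hh, 0) < veh
) / len(vehicles)

print(dap)                 # {'H1': 2, 'H2': 1, 'H3': 2}
print(vap)                 # {'H1': 2, 'H2': 1, 'H3': 1}
print(share_fewer_drivers)
```

In this toy data only H3 (2 driving-age persons, 3 vehicles) has fewer drivers than vehicles, so the share is one household out of three.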

Thanks for the response!


Hi Chuck,

If you’re trying to migrate your analysis to R, I highly recommend also taking a look at the ipumsr package. You might also find this in-depth webinar helpful in getting started.