Household size instead of family size

I am doing an analysis using ASEC data from 2018-2022, and I’m currently using the FAMSIZE (family size) variable. However, I would like to get a household size variable to account for residences with multiple households. Is there a way for me to get that in ASEC alone? Or do I have to construct a new variable using the family interrelationship variables available in ASEC? If so, is there some pseudocode showing how to derive the household size of ASEC sampled persons/households?


There are two ways you can find the household size. The first is to count the number of person records in the household. Each household in the ASEC in a particular year is identified by a unique combination of YEAR and SERIAL. The second option is to include the variable NUMPREC in your extract. This variable gives the number of person records in the household. Both options will give you the same numbers, provided you have not dropped any person records from your extract before counting.
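As a rough illustration of the two options above, here is a minimal Python sketch that counts person records per YEAR + SERIAL and compares the count with NUMPREC. The records are made-up toy data, and HHSIZE and household_sizes are hypothetical names used only for this example; it assumes the extract still contains every person in each household.

```python
from collections import Counter

# Made-up person-level records, one dict per person in the extract.
persons = [
    {"YEAR": 2018, "SERIAL": 1, "PERNUM": 1, "NUMPREC": 2},
    {"YEAR": 2018, "SERIAL": 1, "PERNUM": 2, "NUMPREC": 2},
    {"YEAR": 2018, "SERIAL": 2, "PERNUM": 1, "NUMPREC": 1},
    {"YEAR": 2019, "SERIAL": 1, "PERNUM": 1, "NUMPREC": 3},
    {"YEAR": 2019, "SERIAL": 1, "PERNUM": 2, "NUMPREC": 3},
    {"YEAR": 2019, "SERIAL": 1, "PERNUM": 3, "NUMPREC": 3},
]


def household_sizes(records):
    """Count person records for each (YEAR, SERIAL) household."""
    return Counter((r["YEAR"], r["SERIAL"]) for r in records)


sizes = household_sizes(persons)

# Attach the counted size to every person record (HHSIZE is a hypothetical name).
for r in persons:
    r["HHSIZE"] = sizes[(r["YEAR"], r["SERIAL"])]

# Any record listed here has a counted size that disagrees with NUMPREC.
mismatches = [r for r in persons if r["HHSIZE"] != r["NUMPREC"]]
print(len(mismatches))
```

Note that SERIAL 1 appears in both 2018 and 2019 in the toy data and is counted as two separate households, which is why YEAR has to be part of the key.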

Hi Matthew,

Thanks for the response.
If I keep only non-repeat records (MISH %in% 1:4), do you think counting records by YEAR+SERIAL within that subset will still give an accurate household size?

Additionally, for the second option, how do I use that NUMPREC variable in conjunction with the ASEC data? Do I have to exclude other individuals from the same household or can I just use the data extract as is (summary stats for NUMPREC on the entire ASEC population)? Do I just create a survey design that uses ASEC household weights instead of ASEC person weights and analyze NUMPREC using that design for summary statistics? In addition, how would I combine variables with different survey weights in subsequent regression analyses? (For example, if I wanted the predictors to include VETSTAT in ASEC (person-level), NUMPREC in ASEC (household-level), and HHINCOME in ASEC (household-level)).

YEAR+SERIAL will uniquely identify a household within a given ASEC sample. So if you exclude those with MISH=5-8, it should uniquely identify households in your dataset.

If you are calculating household-level summary statistics (including average household size), you should restrict your sample to one record per household (for example using PERNUM=1), and then use the household weights (ASECWTH).
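For example, the household-level calculation described above might look like the following minimal Python sketch. The records and weight values are made up, and weighted_mean_household_size is a hypothetical helper name; it only illustrates keeping one record per household (PERNUM = 1) and weighting by ASECWTH.

```python
# Made-up person-level records; ASECWTH is repeated on every person in a household.
persons = [
    {"YEAR": 2018, "SERIAL": 1, "PERNUM": 1, "NUMPREC": 2, "ASECWTH": 1000.0},
    {"YEAR": 2018, "SERIAL": 1, "PERNUM": 2, "NUMPREC": 2, "ASECWTH": 1000.0},
    {"YEAR": 2018, "SERIAL": 2, "PERNUM": 1, "NUMPREC": 1, "ASECWTH": 3000.0},
]


def weighted_mean_household_size(records):
    """Weighted mean of NUMPREC using one record per household."""
    # Keep a single record per household so larger households are not overcounted.
    one_per_household = [r for r in records if r["PERNUM"] == 1]
    total_weight = sum(r["ASECWTH"] for r in one_per_household)
    weighted_sum = sum(r["NUMPREC"] * r["ASECWTH"] for r in one_per_household)
    return weighted_sum / total_weight


mean_size = weighted_mean_household_size(persons)
print(mean_size)
```

Without the PERNUM = 1 restriction, the two-person household would enter the average twice and pull the result upward. This sketch gives a point estimate only; standard errors would need a proper survey design.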

Regarding regression, which weights you use depends on the unit of analysis. For any person-level analysis, even if you include a household-level variable as a regressor, you should use the person weights (ASECWT).

Hi Matthew,

Thanks so much for your help again; it was very insightful.

I have one more question/clarification about YEAR, SERIAL, and MISH.
If YEAR + SERIAL uniquely identifies a household within a given ASEC sample, that means that YEAR + SERIAL only identifies a unique household within a single ASEC year. For a dataset that combines consecutive ASEC years, I would additionally need to filter out MISH 5-8 to guarantee that there are no repeats from a consecutive year, correct? So the combination of YEAR + SERIAL, plus keeping only MISH 1-4, over a five-year span of ASEC data would ensure a dataset with unique persons and households?

That is correct. If you do not filter out those with MISH 5-8 you would get repeat households. If you really want to be sure, you can also look at CPSID, which is a longitudinal household identifier and is the same for a given household across all of its samples. So you will see the same CPSID appear twice in the ASEC dataset, in two consecutive years. You can then drop either the first or the second year to keep only one instance of each household.
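As a minimal Python sketch of the CPSID check described above: the records below are made up, keep_first_appearance is a hypothetical helper name, and the sketch assumes every household has a usable, nonzero CPSID. Records without one would need separate handling.

```python
# Made-up records: household 111 is sampled in 2018 (MISH 2) and again in 2019 (MISH 6).
persons = [
    {"YEAR": 2018, "SERIAL": 10, "PERNUM": 1, "MISH": 2, "CPSID": 111},
    {"YEAR": 2018, "SERIAL": 10, "PERNUM": 2, "MISH": 2, "CPSID": 111},
    {"YEAR": 2019, "SERIAL": 57, "PERNUM": 1, "MISH": 6, "CPSID": 111},
    {"YEAR": 2019, "SERIAL": 57, "PERNUM": 2, "MISH": 6, "CPSID": 111},
    {"YEAR": 2019, "SERIAL": 58, "PERNUM": 1, "MISH": 1, "CPSID": 222},
]


def keep_first_appearance(records):
    """Keep only the earliest YEAR in which each CPSID household appears."""
    first_year = {}
    for r in records:
        cpsid = r["CPSID"]
        if cpsid not in first_year or r["YEAR"] < first_year[cpsid]:
            first_year[cpsid] = r["YEAR"]
    return [r for r in records if r["YEAR"] == first_year[r["CPSID"]]]


by_cpsid = keep_first_appearance(persons)

# The MISH 1-4 filter should select the same records in this toy data.
by_mish = [r for r in persons if 1 <= r["MISH"] <= 4]
print(by_cpsid == by_mish)
```

Comparing the two results on your own extract is one way to confirm that the MISH filter removed every repeat household.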

Another way to do this is to download a longitudinal ASEC extract, and keep only variables from the first or the second year.