Weights when estimate includes person and housing unit variables

How do I employ weights when I want to get an estimate of the number of housing units in a community that rent for an amount above $2000 and are occupied by at least one individual over the age of 55?

How you go about this depends on whether you plan to conduct the analysis in the online data analysis tool (SDA) or by analyzing an extract in a statistical package. I have a few comments that I hope will help you accomplish your goal.

First, to identify households where at least one member is age 55 or older, you will need to use the variable SERIAL, which is a sample-specific unique household identifier, to link all members of a household and check their ages. The SDA tool can estimate the number of people 55 years of age or older who live in a household where rent is above a certain value, but since your query requires referencing other members of the household, it cannot be done in the SDA. It can be done in a statistical package with an extract.
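As a rough illustration of the household grouping described above, here is a minimal Python sketch using only the standard library. SERIAL, PERNUM and AGE are IPUMS variable names; the records, the function name and the 55-year cutoff parameter are invented for the example, and a real extract would be read from a file rather than typed in.

```python
from collections import defaultdict

# Toy person-level records standing in for an IPUMS extract (values are invented).
persons = [
    {"SERIAL": 1, "PERNUM": 1, "AGE": 40},
    {"SERIAL": 1, "PERNUM": 2, "AGE": 58},
    {"SERIAL": 2, "PERNUM": 1, "AGE": 30},
    {"SERIAL": 3, "PERNUM": 1, "AGE": 71},
]


def households_with_member_aged(persons, min_age=55):
    """Return the SERIAL values of households with at least one member aged min_age or older."""
    oldest = defaultdict(int)  # SERIAL -> age of the oldest member seen so far
    for p in persons:
        oldest[p["SERIAL"]] = max(oldest[p["SERIAL"]], p["AGE"])
    return {serial for serial, age in oldest.items() if age >= min_age}


has_55_plus = households_with_member_aged(persons)
print(sorted(has_55_plus))
```

Because SERIAL is unique only within a sample, an extract that pools several samples would need to group on the combination of SAMPLE and SERIAL instead.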

Second, depending on what you mean by “community,” you may or may not be able to achieve your desired level of geographic precision using IPUMS USA data (with or without the SDA). The smallest explicitly identified geographic area in the microdata is the PUMA (public use microdata area), which is a geographic area of at least 100,000 residents, defined by the Census Bureau. Our IPUMS geographers are sometimes able to infer county and/or metropolitan area from PUMA, depending on how the boundaries of PUMA and other geographic areas align. See COUNTYFIP and MET2013/MET2023 for more information.

Finally, on the question of weights: you will need to use HHWT (our household-level weight), since this is a household-level analysis. For household-level estimates, you will also want to restrict your sample to a single record per household using the filter PERNUM = 1, so that each household's weight is counted only once. You can find sample R code for incorporating weights into your analysis in these IPUMS training exercises. When using ACS data, you may additionally want to use replicate weights to calculate empirically derived standard errors or confidence intervals.
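To make the weighting step concrete, here is a hedged Python sketch of the full estimate: flag households with a member 55 or older, keep one record per household with PERNUM = 1, apply the rent filter, and sum HHWT. The toy records and function names are hypothetical, and using RENT as the rent variable is an assumption (the answer above does not name one). The standard-error helper assumes the ACS successive-difference replication formula with 80 replicate estimates, each computed by repeating the estimate with one household replicate weight in place of HHWT.

```python
import math

# Toy person-level records (values invented). RENT as the rent variable is an assumption.
persons = [
    {"SERIAL": 1, "PERNUM": 1, "AGE": 40, "RENT": 2500, "HHWT": 120},
    {"SERIAL": 1, "PERNUM": 2, "AGE": 58, "RENT": 2500, "HHWT": 120},
    {"SERIAL": 2, "PERNUM": 1, "AGE": 30, "RENT": 2600, "HHWT": 90},
    {"SERIAL": 3, "PERNUM": 1, "AGE": 71, "RENT": 1500, "HHWT": 200},
    {"SERIAL": 4, "PERNUM": 1, "AGE": 66, "RENT": 2100, "HHWT": 75},
]


def weighted_household_count(persons, min_rent=2000, min_age=55, weight="HHWT"):
    """Weighted number of households renting above min_rent with a member aged min_age or older."""
    # Households with at least one qualifying member, found by scanning every person record.
    qualifying = {p["SERIAL"] for p in persons if p["AGE"] >= min_age}
    # One record per household (PERNUM == 1) so each household weight is summed once.
    return sum(
        p[weight]
        for p in persons
        if p["PERNUM"] == 1 and p["SERIAL"] in qualifying and p["RENT"] > min_rent
    )


def replicate_se(estimate, replicate_estimates):
    """Standard error from replicate estimates; with 80 ACS replicates the factor is 4/80."""
    factor = 4 / len(replicate_estimates)
    return math.sqrt(factor * sum((r - estimate) ** 2 for r in replicate_estimates))


estimate = weighted_household_count(persons)
print(estimate)
```

In a real extract, the replicate estimates would come from calling `weighted_household_count` once per replicate weight column via the `weight` argument and passing the resulting list to `replicate_se`.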

Thank you, Isabel. I appreciate your very complete answer to my question.
