When comparing the US Decennial Census data (5% sample) with ACS data, should I use weights for the ACS data to make them comparable? Or do I need to use weights for both the decennial Census and the ACS? Thanks a lot for any advice!
You should use weights for both samples so that your estimates from each source are representative of the population that you’re analyzing. The weight variables indicate how many persons in the population are represented by each sample case. This number varies from case to case and from sample to sample. You should use PERWT for generating person-level estimates (e.g. the population of each state or the proportion of people under a certain age) and HHWT for household-level estimates (e.g. the percent of housing units that are owned by their inhabitants). You can find more details about weights on the FAQ page and the sample weights page. Apologies for the very delayed response. I hope this information is still helpful for you.
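To make the PERWT/HHWT distinction concrete, here is a minimal Python sketch. The records are made up for illustration and are not real Census or ACS data. It assumes that OWNERSHP is coded 1 for owned and 2 for rented, and that keeping the PERNUM == 1 record is an acceptable way to get one row per household. `weighted_share` is a hypothetical helper, not part of any IPUMS tool.

```python
# Hypothetical person-level extract; all values are made up for illustration.
# Assumption: OWNERSHP == 1 means owned, OWNERSHP == 2 means rented.
persons = [
    {"SERIAL": 1, "PERNUM": 1, "AGE": 34, "PERWT": 20, "HHWT": 20, "OWNERSHP": 1},
    {"SERIAL": 1, "PERNUM": 2, "AGE": 6,  "PERWT": 25, "HHWT": 20, "OWNERSHP": 1},
    {"SERIAL": 2, "PERNUM": 1, "AGE": 70, "PERWT": 30, "HHWT": 30, "OWNERSHP": 2},
    {"SERIAL": 3, "PERNUM": 1, "AGE": 45, "PERWT": 50, "HHWT": 50, "OWNERSHP": 1},
    {"SERIAL": 3, "PERNUM": 2, "AGE": 15, "PERWT": 55, "HHWT": 50, "OWNERSHP": 1},
]


def weighted_share(rows, weight, condition):
    """Share of the weighted total for which condition(row) is true."""
    total = sum(row[weight] for row in rows)
    matching = sum(row[weight] for row in rows if condition(row))
    return matching / total


# Person-level estimates: use every person record, weighted by PERWT.
population = sum(p["PERWT"] for p in persons)
share_under_18 = weighted_share(persons, "PERWT", lambda p: p["AGE"] < 18)

# Household-level estimate: keep one record per household, weighted by HHWT.
households = [p for p in persons if p["PERNUM"] == 1]
owner_share = weighted_share(households, "HHWT", lambda h: h["OWNERSHP"] == 1)

print(population, round(share_under_18, 3), round(owner_share, 3))
```

The same pattern applies to both the decennial sample and the ACS: compute each estimate with that sample's own weights, then compare the weighted results.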
Thanks a lot! Yes, in fact your response is still very helpful for me. The percent of housing units that are owned by the household is precisely what I’m interested in. Here, I have a little follow-up question: If my dataset is restricted to only 1 person per household (the householder), what difference does it then make whether I use HHWT or PERWT for my regression (ownershp is my dependent variable)? I understand that HHWT would be more appropriate but was just wondering because I seem to have some issues concerning collinearity when I use HHWT, but not when I use PERWT. Thanks in advance!
If your dataset is restricted to one person per household (the householder), it makes little practical difference whether you use PERWT or HHWT for your analysis. HHWT is the intended weight, but both will give you household-level estimates. As the ACS Design and Methodology Report (pg. 24) notes, this is because “the housing unit weight is equal to the person weight of the householder.” The two weights are not always exactly identical, but they are similar enough that the difference in your estimate will typically be “within a tenth of a percent at the county level.”
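If you want to check this on your own extract, one rough approach is to count how many householder records have PERWT different from HHWT and to compute the ownership rate both ways. The Python sketch below uses made-up values. `OWNED` is a hypothetical 0/1 indicator you would derive from OWNERSHP, and `weighted_mean` is an illustrative helper.

```python
# Hypothetical householder-only records; all values are made up for illustration.
# OWNED is an assumed 0/1 indicator derived from OWNERSHP.
householders = [
    {"OWNED": 1, "PERWT": 20, "HHWT": 20},
    {"OWNED": 0, "PERWT": 30, "HHWT": 30},
    {"OWNED": 1, "PERWT": 50, "HHWT": 50},
    {"OWNED": 0, "PERWT": 40, "HHWT": 41},  # weights differ slightly here
]


def weighted_mean(rows, value, weight):
    """Weighted mean of rows[value] using rows[weight] as the weights."""
    numerator = sum(row[value] * row[weight] for row in rows)
    denominator = sum(row[weight] for row in rows)
    return numerator / denominator


# Ownership rate computed with each weight variable.
rate_perwt = weighted_mean(householders, "OWNED", "PERWT")
rate_hhwt = weighted_mean(householders, "OWNED", "HHWT")

# Number of householder records where the two weights are not identical.
n_differ = sum(1 for row in householders if row["PERWT"] != row["HHWT"])

print(rate_perwt, rate_hhwt, n_differ)
```

If the two weights match on nearly every record and the two rates are very close, the choice between them should have little effect on a householder-only analysis.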