How do I account for HHWT when looking at data about number of households in a state from sample w/ all states?

I am looking to calculate how many households in each state are under 200% of the poverty level. I am doing this by recoding HHINCOME into new variables for every household size, based on poverty guideline income ranges that depend on the number of household members (e.g., 2-member households are under 200% of the poverty level if their income is <31,860). I am using NUMPREC to determine household size and PERNUM=1 to represent the household. I apply HHWT as the weight before I run crosstabs. However, whenever I weight the data, the number of households becomes extremely high. For example, weighted Alaska has 556,440 households, but in reality the Census says it’s only 250,185. Should the weighted number reflect reality? I am wondering if my use of weights is incorrect because I am looking at state-level data within a sample for the whole US.

Calculating weighted state-level statistics shouldn’t be a problem with the household weight variable, HHWT. I suspect the issue is that your dataset has more than one observation per household. HHWT is repeated on every person record in a household, so summing it across all person records counts each household once per member. For household-level analysis you need exactly one observation per household. The easiest way to do this with IPUMS USA data is to keep only records with PERNUM==1. Once you do this, your weighted household counts should be much closer to the official published US Census counts.
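To illustrate the difference, here is a minimal sketch in plain Python. The rows and weights are made up, and STATEFIP (the state FIPS code) is an assumption about which state variable is in the extract. It sums HHWT by state twice: once over all person records, and once over PERNUM==1 records only.

```python
from collections import defaultdict

# Hypothetical toy extract: one row per person, with the household weight
# repeated on every person record of the same household. Values are made up.
rows = [
    {"STATEFIP": 2, "PERNUM": 1, "HHWT": 100},
    {"STATEFIP": 2, "PERNUM": 2, "HHWT": 100},
    {"STATEFIP": 2, "PERNUM": 1, "HHWT": 150},
    {"STATEFIP": 6, "PERNUM": 1, "HHWT": 200},
    {"STATEFIP": 6, "PERNUM": 2, "HHWT": 200},
    {"STATEFIP": 6, "PERNUM": 3, "HHWT": 200},
]


def weighted_households(rows, first_person_only):
    """Sum HHWT by state, optionally keeping only PERNUM == 1 records."""
    totals = defaultdict(float)
    for row in rows:
        if first_person_only and row["PERNUM"] != 1:
            continue  # skip every person after the first in each household
        totals[row["STATEFIP"]] += row["HHWT"]
    return dict(totals)


# Summing over every person record counts each household once per member.
inflated = weighted_households(rows, first_person_only=False)

# Restricting to PERNUM == 1 counts each household exactly once.
correct = weighted_households(rows, first_person_only=True)

print("all person records:", inflated)
print("PERNUM == 1 only:  ", correct)
```

In this toy data the unrestricted totals are larger in every state, and the gap grows with household size, which is the same pattern as the inflated Alaska count.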

I hope this helps. Let me know if you have any additional questions.

I have a similar question related to household counts. When I download data for an MSA, here St. Louis (#41180), using “ownership” as the only other selected field, I get a household count of about 2.7 million; that is, I summed the HHWT values across some 25 thousand records. But when I use the SDA online tool, filtering only for the MSA and pernum(1), I get a household count of 1.2 million. What am I doing wrong?

It sounds like you may not be restricting your sample to pernum==1 in the extract that you downloaded. That restriction is equivalent to the pernum(1) filter that you included in the SDA tool, which is correct. If that’s not the case, let me know and I can look into it some more. Also, can you tell me which sample you are using?

Thanks Matthew… you likely identified my problem. To get to pernum==1 do I check the boxes as I’ve done below or is there something else I should do?

There are two ways to restrict your sample to those with PERNUM==1.

The first (Edited 12/17/2020) is to change the data structure (see screenshot below) to “Household Records Only.” You’ll still be able to select cases for OWNERSHP and MET2013 (but will lose any person-level variables that you may have added to your extract).

The second method is to load the full extract into your statistical software package, then drop all cases that have PERNUM>1. The exact way to do this differs based on which software you are using.
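As one illustration of the second method, here is a rough sketch in plain Python using only the standard library. The inline CSV is a made-up stand-in for a downloaded extract, and SERIAL (the household serial number) is an assumption about what the extract contains; the uniqueness check also assumes a single sample, since SERIAL identifies households within a sample.

```python
import csv
import io

# Hypothetical stand-in for a downloaded CSV extract (values are made up).
extract = io.StringIO(
    "SERIAL,PERNUM,HHWT\n"
    "1,1,100\n"
    "1,2,100\n"
    "2,1,150\n"
    "3,1,200\n"
    "3,2,200\n"
    "3,3,200\n"
)

# Load the full extract: one row per person.
people = list(csv.DictReader(extract))

# Drop all cases with PERNUM > 1, leaving one record per household.
households = [row for row in people if int(row["PERNUM"]) == 1]

# Sanity check: each household serial number should now appear exactly once.
serials = [row["SERIAL"] for row in households]
one_per_household = len(serials) == len(set(serials))

# The weighted household count is the sum of HHWT over the remaining records.
total_households = sum(float(row["HHWT"]) for row in households)

print(len(people), "person records ->", len(households), "household records")
print("one record per household:", one_per_household)
print("weighted household count:", total_households)
```

With a real extract you would read the downloaded file instead of the inline string. The equivalent step in Stata, R, or another package is the same idea: keep the rows where PERNUM equals 1 before summing HHWT.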

Matthew – Sorry for the delay in thanking you for your response. Regarding approach #1, how do I use the ‘select cases’ button to filter for pernum(1)? Select cases only allows me to select from the variables I had identified for the data run (i.e., ownership and met2013).

Sorry, I gave you wrong information earlier! The Select Cases functionality doesn’t work for PERNUM, as you discovered. I’ve edited my earlier response to remove the misleading information. But there is a different way to select only one record per household in IPUMS USA. Choose to change the data structure (see screenshot below) to “Household Records Only.” You’ll still be able to select cases for OWNERSHP and MET2013.

Matthew… This guidance worked! Thanks for sending the correction.