I am working with a person-level extract from the 2010 ACS. My extract includes the variable HHINCOME, which measures income at the household level and is therefore the same for every member of a household.

I want to create a measure of household income inequality: the total income of the top 5 percent of the distribution divided by the total income of the bottom 20 percent. I want this measure for the dataset as a whole (a national estimate) and also separately for each of the 100 largest metro areas. I have two questions related to this task.

Number one: if I do this calculation on the whole dataset, I will be counting some household income values multiple times, so I need to limit the universe somehow. I think the way to do it is to select only those cases where PERNUM=1 and then do the sum calculation, but I would love confirmation of that.

Number two: what do I do with weights here? Do I apply the household weights, since I will have only one case from each household? To calculate the sum accurately, it seems I need to multiply each HHINCOME by the associated weight before adding them together.

I would appreciate any insight. Thanks!
Your intuition is correct on both questions. First, because your dataset has multiple individuals per household, you should limit your observations to those with PERNUM==1. This ensures that each household appears exactly once. Second, because HHINCOME is a household-level measure, you’ll want to use the HHWT variable to calculate nationally representative statistics: multiply each household’s HHINCOME by its HHWT before summing, and use HHWT as well when locating the top 5 percent and bottom 20 percent of households.
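As a rough illustration of those two steps, here is a minimal Python sketch. It assumes the extract has been read into a list of dicts keyed by the IPUMS variable names PERNUM, HHINCOME, and HHWT; the function name `top5_bottom20_ratio` and the sample data are hypothetical. It also makes one choice the discussion above does not settle: a household that straddles a percentile cutoff contributes only the part of its weight that falls inside the range.

```python
def top5_bottom20_ratio(persons):
    """Weighted income of the top 5% of households divided by the bottom 20%."""
    # Keep one record per household: the person with PERNUM == 1.
    households = [(p["HHINCOME"], p["HHWT"]) for p in persons if p["PERNUM"] == 1]
    households.sort(key=lambda h: h[0])  # order households from poorest to richest
    total_weight = sum(weight for _, weight in households)

    def weighted_income(lo, hi):
        # Sum HHINCOME * HHWT over the share [lo, hi] of the weighted distribution.
        start, end = lo * total_weight, hi * total_weight
        cumulative = 0.0
        total = 0.0
        for income, weight in households:
            # Portion of this household's weight that lies inside [start, end].
            overlap = min(cumulative + weight, end) - max(cumulative, start)
            if overlap > 0:
                total += income * overlap
            cumulative += weight
        return total

    return weighted_income(0.95, 1.0) / weighted_income(0.0, 0.20)


# Made-up data: 20 households, 3 persons each, every HHWT equal to 1.
persons = [
    {"PERNUM": pernum, "HHINCOME": 10 * serial, "HHWT": 1}
    for serial in range(1, 21)
    for pernum in (1, 2, 3)
]
print(top5_bottom20_ratio(persons))  # 2.0
```

For the metro-level measures, the same function could be called on the subset of records belonging to each metro area, so that the percentile cutoffs are computed within that area rather than nationally.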
I hope this helps.
Fabulous – thank you so much for your help. I just needed to confirm that I was thinking through it correctly!