I can’t come close to replicating median household income (computed on the full ACS microdata) using the IPUMS ACS sample. Using IPUMS ACS data, I find a median household income of about $66,000 in 2015, whereas other sources using the full ACS microdata report a much lower figure, about $57,000. I am pretty sure that I am using the right weights (in fact, no weighting choice closes the gap). Restricting to non-group-quarters households and similar adjustments doesn’t significantly affect the discrepancy. Does anyone know what could be going on here? The gap is huge.
I’m not exactly sure how you are running your calculation, but I am able to get relatively close to the estimates in this Census Bureau report.
A couple of notes about data preparation may be helpful. First, when performing household-level analysis such as this, it is important to ensure that your calculation counts only one person per household; otherwise HHINCOME is counted once for every household member. An easy way to do this is to keep only observations with PERNUM==1. Second, in addition to dropping all group quarters observations (drop GQ>2), be sure to clean out any remaining observations with HHINCOME==9999999. Finally, the calculations run by the US Census Bureau are not limited by the confidentiality requirements that impose restrictions such as top and bottom codes on household income in the public use data. Therefore, although the results should be close, we don’t generally expect to replicate official Census Bureau statistics exactly with public use microdata.
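To make those steps concrete, here is a minimal sketch in Python of a weighted median over one record per household. It assumes each record is a dict keyed by the IPUMS variable names PERNUM, GQ, and HHINCOME, and that the household weight is in a column named HHWT (the thread does not say which weight was used, so that is an assumption). The function names are made up for illustration, and this median takes the first income at which cumulative weight reaches half the total, which may differ slightly from the interpolation used in published estimates.

```python
def weighted_median(values, weights):
    """Return the smallest value whose cumulative weight reaches half the total."""
    pairs = sorted(zip(values, weights))
    total = sum(weight for _, weight in pairs)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    running = 0.0
    for value, weight in pairs:
        running += weight
        if running >= total / 2:
            return value


def median_household_income(rows):
    """rows: iterable of dicts with PERNUM, GQ, HHINCOME and HHWT (assumed names)."""
    incomes, weights = [], []
    for row in rows:
        if int(row["PERNUM"]) != 1:         # keep one record per household
            continue
        if int(row["GQ"]) > 2:              # drop group quarters, as suggested above
            continue
        if int(row["HHINCOME"]) == 9999999:  # drop the 9999999 code
            continue
        incomes.append(int(row["HHINCOME"]))
        weights.append(float(row["HHWT"]))  # assumption: household weight column
    return weighted_median(incomes, weights)
```

Running the same calculation without the PERNUM filter is a quick way to check whether counting every household member accounts for a gap like the one described.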
Is it possible to limit observations to PERNUM=1 inside IPUMS USA? That option doesn’t seem to be available in the Select Cases tool, even though there is an option to select cases on GQ. The same goes for HHINCOME=9999999. If I’m downloading to CSV, can you suggest a way to clean out these observations?
You are correct that it is not currently possible to use the PERNUM variable in the Select Cases tool. I’ll suggest this to the IPUMS USA Team to consider in the future. In the meantime, you can download the data, read it into a statistical package (such as Stata, SAS, SPSS, or R), keep only the observations with PERNUM==1, and then export the result as a .csv file.
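If none of those packages is at hand, the same cleaning can be done on the CSV extract directly. The following is a rough sketch using only Python's standard csv module. It assumes the extract has a header row with columns named PERNUM, GQ, and HHINCOME. The function names are hypothetical.

```python
import csv


def keep_household_row(row):
    """True for the one record per household that should stay in the file."""
    return (
        int(row["PERNUM"]) == 1             # first person in each household
        and int(row["GQ"]) <= 2             # not group quarters (drop GQ>2)
        and int(row["HHINCOME"]) != 9999999  # HHINCOME code to clean out
    )


def clean_extract(src, dst):
    """Copy the rows that pass keep_household_row from src to dst.

    src and dst are open text-mode file objects; returns the number of rows kept.
    """
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    kept = 0
    for row in reader:
        if keep_household_row(row):
            writer.writerow(row)
            kept += 1
    return kept
```

To use it, open the downloaded extract for reading and a new file for writing (both with newline=""), then pass the two file objects to clean_extract. The output keeps every original column, so the weights are still available for the median calculation.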