Why do I fail to replicate published ACS aggregates?


I am interested in counts of households in the 5-year ACS samples. As a first step, I am trying to benchmark the national total against aggregates published by the Census Bureau, but my estimate is off by about 8 million.

I load the 2015-19 ACS microdata and sum up the HHWT (counting each household once), using the following Stata code:

do usa_00024
qui sum hhwt if pernum == 1 & year == 2019
di %9.0f r(sum)

The result is 128,847,014. I expect to see 120,756,048, the published aggregate in table DP02 (2019 5-year estimates) on Explore Census Data.

Why does this discrepancy exist? What do I have to do to the IPUMS data to match the household definition the Census Bureau uses when calculating household counts?


Hi Ben,

Your estimate using IPUMS ACS data includes people living in group quarters, which the Census Bureau does not count as households. If you restrict your data to records with GQ values of 1, 2, and 5, IPUMS ACS data should give you an estimate of 120,756,015 for the total number of households in 2019. That is still 33 short of the published figure; the remaining discrepancy arises because official statistics are calculated from restricted-access data that is not available through IPUMS.
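
A minimal sketch of that restriction, adapted from the code in your question. It assumes your extract includes the GQ variable (named gq in Stata); the confirm line is only a guard I added so the code stops early if it is missing.

```stata
* Load the IPUMS extract (same do-file as in the question)
do usa_00024

* Assumption: the extract contains GQ; this stops with an error if it does not
confirm variable gq

* Count each household once (pernum == 1), keeping only GQ codes 1, 2, and 5
qui sum hhwt if pernum == 1 & year == 2019 & inlist(gq, 1, 2, 5)

* Display the weighted household total with thousands separators
di %12.0fc r(sum)
```

The only substantive change from your original command is the added inlist(gq, 1, 2, 5) condition; everything else keeps your one-record-per-household approach.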

Thanks Ivan!