Weighted counts substantially off from official Census estimates

I’m having trouble obtaining a weighted count of children under 5 years of age that falls within the margin of error of the official estimate reported on data.census.gov. We will be doing more complicated analyses, but I ran this as a first pass to make sure everything is in order, and it’s not. My weighted count of children age 5+ is similar to the official estimate, but my count of children under 5 is not. I tried two samples (the 2019 1-year and the 2015-2019 5-year estimates), and in both cases my estimate falls outside the margin of error of the official estimate. More specifically:

The Census Bureau Table S0101 for the 2019 ACS reports an estimate of 19,404,835 children under 5 years of age (+/- 22,314).

My program (using the 2019 ACS) returns an estimate of 19,315,823. The discrepancy is 89,012, which is roughly four times the margin of error around the official census.gov estimate.

I downloaded the data, ran the IPUMS-generated program, and then ran the following program:

* ACS 2019 Weighted Count of Children Under 5 Years
* data.census.gov shows 19,404,835 (+/- 22,314)

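* Flag children under 5 with a 0/1 indicator so everyone stays in the estimation sample
gen byte underage5 = (age < 5)

* Declare the survey design, then tabulate weighted counts
* (the row for underage5 = 1 is the weighted count of children under 5)
svyset cluster [pweight=perwt], strata(strata)
svy: tab underage5, count format(%11.0gc)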

* My estimate from above is 19,315,823
* Lower than official estimate by 89,012

Can anyone spot my error, or is there something about very young children that the IPUMS weight doesn’t get right? Thank you so much for your help!

We generally do not expect public use microdata for the ACS to exactly replicate official statistics. This is because the public use microdata use a slightly different sample from the one used to generate “official” statistics. You can read more about this detail on this page.

I estimated the number of children under 5 years old using the 2019 ACS PUMS data available from IPUMS USA and arrived at the same estimate that you did. Although estimates from the microdata are usually very close to the official estimates, they do not always fall within the margin of error of the official estimates, and we shouldn’t expect them to. The estimates from microdata have their own margin of error, which will be somewhat larger than the one for the official estimates. If you ran a t-test for the difference between these two estimates, the difference may not be statistically significant. As a side note, I recommend you use replicate weights for your estimates (which won’t change the point estimate, but which will change the standard errors).
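
Here is a minimal sketch of both suggestions, not a definitive recipe. It assumes your extract includes the REPWTP replicate weights (so that repwtp1-repwtp80 exist) and uses Stata's BRR variance option with a Fay adjustment of 0.5. The indicator name under5 and the scalar names are just illustrative. The test at the end treats the two estimates as independent, which is a simplification, and converts the official 90% margin of error to a standard error by dividing by 1.645.

```stata
* Replicate-weight variance (assumes repwtp1-repwtp80 are in the extract)
svyset [pweight=perwt], vce(brr) brrweight(repwtp1-repwtp80) fay(.5) mse

* 0/1 indicator so the full sample stays in the estimation (name is illustrative)
gen byte under5 = (age < 5)

* Weighted total with a 90% confidence interval, to match the ACS margin of error
svy, level(90): total under5

* Rough z statistic for the gap from the official S0101 figure.
* Assumption: the two estimates are treated as independent.
scalar se_pums = _se[under5]
scalar se_off  = 22314 / 1.645
scalar z = (19404835 - _b[under5]) / sqrt(se_pums^2 + se_off^2)
display "z = " %6.2f z
```

The point estimate from svy: total should match the weighted count you already have; only the standard error changes. How large z turns out to be depends on the microdata standard error, which you will only know after running it.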

Thank you so much! That is really good to know. We do plan on using replicate weights, but I wanted to get my point estimates close first. Thanks again. This forum is an amazing resource.