Trouble replicating ACS data. Even simple calculations.



I’m trying to replicate some custom analysis, but first I wanted to check whether my basic numbers are right. To do that, I’m simply trying to recreate the median age of the United States as published in the 2014 single-year ACS (37.7 years).

To do this, I am using both the online IPUMS tool and R. With both methods, I get the same result: a mean of 38.13 and a median of 37. (If I break the ties in R, I get a median of 36.7.) Note: I do get an accurate population estimate, so that is encouraging.

On the online tool, I’m using age as the row variable and a filter for 2014, along with person weights.
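For concreteness, here is a minimal sketch of the kind of calculation described above, written in Python rather than R and run on made-up toy records, not real ACS data. The variable names `ages` and `weights` are placeholders for an age column and a person-weight column. The last function assumes that an age recorded in completed years stands for the interval from that age up to the next birthday and interpolates linearly within it; that is one common grouped-data convention, and whether it matches the method behind the published figure is an assumption.

```python
def weighted_mean(values, weights):
    """Weighted average: sum(value * weight) / sum(weight)."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def weighted_median(values, weights):
    """Smallest value whose cumulative weight reaches half the total weight."""
    half = sum(weights) / 2.0
    cumulative = 0.0
    for value, weight in sorted(zip(values, weights)):
        cumulative += weight
        if cumulative >= half:
            return value


def interpolated_median(ages, weights):
    """Median treating integer age a as the interval [a, a + 1).

    Assumption: weight is spread evenly within each one-year interval.
    """
    weight_by_age = {}
    for age, weight in zip(ages, weights):
        weight_by_age[age] = weight_by_age.get(age, 0.0) + weight
    half = sum(weight_by_age.values()) / 2.0
    below = 0.0  # total weight in intervals strictly below the current age
    for age in sorted(weight_by_age):
        if below + weight_by_age[age] >= half:
            return age + (half - below) / weight_by_age[age]
        below += weight_by_age[age]


# Toy records (hypothetical): age in completed years and a person weight.
ages = [10, 25, 37, 37, 52, 80]
weights = [100, 150, 120, 80, 200, 50]

print(weighted_mean(ages, weights))
print(weighted_median(ages, weights))
print(interpolated_median(ages, weights))
```

On this toy data the plain weighted median lands on a whole age, while the interpolated version lands partway through that year of age, so the choice of convention alone can move the result by a fraction of a year.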

Is it even possible to replicate this exactly using IPUMS data? Or am I just using the wrong methodology?




I am not absolutely sure what exactly you are doing, but the published public use data (PUMS, including the version distributed via IPUMS) represents only a fraction of all ACS data (about 40%, I believe). The rest is withheld as a confidentiality protection measure. So results from IPUMS would rarely, if ever, agree to 6 digits with those provided by FactFinder (which relies on the complete ACS) or otherwise published by the Census Bureau (which may be using additional adjustment methods on the back end).

Then there are different concepts of what the generalizable population is, like civilian noninstitutionalized population, total residential population, etc.



Thanks! That’s good to know.

Something else is troubling me now, though. I get the exact same 2014 1-year population estimate using PUMS that is published in FactFinder: 318,857,056. If PUMS only represents about 40% of respondents, I can’t imagine I’d get the same population figure.

Edit: Now that I’m thinking it through more, the person weights are probably set up to match this number. I.e., those who are included in the PUMS are probably weighted more heavily so as to match this population figure. Does that make sense?
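A toy sketch of that idea, in Python with made-up numbers: if a subsample's weights are scaled up so that they sum to the same control total, the population count matches exactly, while other statistics such as the mean age can still differ. The helper `rescale_weights` and all the data here are hypothetical, and a single scaling factor is a deliberate simplification, not a description of how the actual PUMS weights are produced.

```python
def rescale_weights(weights, target_total):
    """Scale weights by one common factor so they sum to target_total."""
    factor = target_total / sum(weights)
    return [w * factor for w in weights]


def weighted_mean(values, weights):
    """Weighted average: sum(value * weight) / sum(weight)."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical "full" sample: six people, each representing 100 people.
full_ages = [5, 20, 33, 41, 58, 76]
full_weights = [100.0] * 6
full_total = sum(full_weights)

# Hypothetical public subsample: every other record, with weights scaled up
# so the subsample still sums to the full population total.
sub_ages = full_ages[::2]
sub_weights = rescale_weights(full_weights[::2], full_total)
sub_total = sum(sub_weights)

full_mean = weighted_mean(full_ages, full_weights)
sub_mean = weighted_mean(sub_ages, sub_weights)

print(full_total, sub_total)  # population totals agree
print(full_mean, sub_mean)    # mean ages need not agree
```

In this sketch the totals agree by construction, but the mean age from the subsample differs from the full-sample mean, which is consistent with getting the population figure exactly while the median age is slightly off.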