Is the "county" variable in the 2012-2016 IPUMS USA sample valid?


#1

I read that the county variable is not available after 1950; however, I was able to include it in my 2012-2016 IPUMS USA extract.

The extract only has data for 35 of the 58 California counties, though. Are the estimates for these 35 counties valid? If so, why aren't all counties represented?


#2

Yes, you are correct that counties are not directly identified in public use microdata from 1950 onwards. This is due to confidentiality restrictions that limit the identification of individuals and households within geographic areas that include “too few” observations. In IPUMS data, however, some counties can be recovered from other low-level geographic identifying information. A list of which counties are identified in which sample years is available here. You should be able to calculate reliable estimates for the counties that are identifiable, but the coverage of counties within each US state will be incomplete, which is why only 35 of California's 58 counties appear in your extract. More information about “missing” US counties is available via this blog post. If you are looking for aggregated statistics at the county level, you may find IPUMS NHGIS to be a useful resource.
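As a rough illustration of working only with the identifiable counties, here is a minimal Python sketch. It assumes your extract includes the IPUMS USA variables STATEFIP, COUNTYFIP, and PERWT, and that COUNTYFIP is coded 0 for records whose county is not identifiable; please confirm both against the codebook for your own extract. The function name and the toy records are made up for the example.

```python
from collections import defaultdict


def weighted_county_totals(rows, state_fips=6):
    """Sum person weights by county for one state.

    Assumption: COUNTYFIP == 0 marks records whose county is not
    identifiable. Those records are tallied separately, not dropped silently.
    """
    totals = defaultdict(float)
    unidentified = 0.0
    for row in rows:
        if int(row["STATEFIP"]) != state_fips:
            continue  # keep only the requested state (6 = California)
        county = int(row["COUNTYFIP"])
        weight = float(row["PERWT"])
        if county == 0:
            unidentified += weight
        else:
            totals[county] += weight
    return dict(totals), unidentified


# Toy records for illustration only; these are not real IPUMS data.
sample_rows = [
    {"STATEFIP": 6, "COUNTYFIP": 37, "PERWT": 100},
    {"STATEFIP": 6, "COUNTYFIP": 37, "PERWT": 50},
    {"STATEFIP": 6, "COUNTYFIP": 0, "PERWT": 80},
    {"STATEFIP": 6, "COUNTYFIP": 73, "PERWT": 20},
    {"STATEFIP": 48, "COUNTYFIP": 201, "PERWT": 60},
]

county_totals, unidentified_total = weighted_county_totals(sample_rows)
print(county_totals)
print(unidentified_total)
```

With a real extract saved as CSV, the rows could come from `csv.DictReader` instead of the toy list. Comparing the unidentified total with the identified totals gives a quick sense of how much of the state's population falls outside the identifiable counties.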