I am working with the ACS sample. It seems the number of counties appearing in the sample decreased from around 370 to around 330 starting in 2012. Is there a reason for this decrease? Besides the change in 2012, are there any other intended geographical coverage changes? Thanks a lot!
The ACS public use microdata sample that IPUMS harmonizes and releases does not report the county in which a respondent household resided. IPUMS USA imputes the county using PUMA, the smallest geographic identifier that is available in the data. As a result, the variable COUNTYFIP identifies a county only if it was coterminous with a single PUMA or if it contained multiple PUMAs, none of which extended into other counties.
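To make the rule concrete, here is a minimal Python sketch of the identifiability check. It is not the actual IPUMS procedure: the function name `identifiable_counties` and the toy PUMA and county labels are made up for illustration, and a real check would need a PUMA-to-county relationship file.

```python
from collections import defaultdict

def identifiable_counties(overlaps):
    """Return the counties that can be identified from PUMA alone.

    `overlaps` is an iterable of (puma, county) pairs, one pair for each
    county that a PUMA covers at least partly (hypothetical input format).
    """
    puma_to_counties = defaultdict(set)
    county_to_pumas = defaultdict(set)
    for puma, county in overlaps:
        puma_to_counties[puma].add(county)
        county_to_pumas[county].add(puma)

    # A county qualifies only if every PUMA touching it lies entirely
    # inside it, i.e. none of its PUMAs extends into another county.
    return {
        county
        for county, pumas in county_to_pumas.items()
        if all(puma_to_counties[p] == {county} for p in pumas)
    }

# Toy data: county A matches one PUMA, county B is split into two PUMAs,
# and PUMA P4 straddles counties C and D.
overlaps = [
    ("P1", "A"),
    ("P2", "B"), ("P3", "B"),
    ("P4", "C"), ("P4", "D"),
    ("P5", "D"),
]
print(sorted(identifiable_counties(overlaps)))  # ['A', 'B']
```

County D is not identifiable even though PUMA P5 lies wholly inside it, because a household in P4 could be in either C or D.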
The change that you observe in 2012 is the result of the redrawing of PUMA boundaries following the 2010 decennial census. The redrawing replaced PUMAs that were created following the 2000 decennial census with PUMAs that maintained the required threshold of at least 100,000 residents at the time of the corresponding census. This change then affected which counties could be identified in the data. A table that lists counties and the years in which they are identified is available for download from the COUNTYFIP variable description page. These 2010 census-based PUMAs were used in the ACS data from 2012 to 2021, after which PUMAs were again redrawn to reflect population data reported in the 2020 decennial census.
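If you want to reproduce the count of identified counties by year from your own extract, something like the following Python sketch may help. It assumes the extract includes YEAR, STATEFIP, and COUNTYFIP, and that COUNTYFIP is 0 when the county is not identifiable (please confirm this against the COUNTYFIP codes page). The function name `counties_per_year` and the sample rows are invented for illustration.

```python
from collections import defaultdict

def counties_per_year(rows):
    """Count distinct identified counties in each sample year.

    `rows` is an iterable of dicts with integer YEAR, STATEFIP and
    COUNTYFIP values. County codes repeat across states, so a county
    is keyed by the (STATEFIP, COUNTYFIP) pair.
    """
    seen = defaultdict(set)
    for row in rows:
        # Assumption: COUNTYFIP == 0 means "county not identifiable".
        if row["COUNTYFIP"] != 0:
            seen[row["YEAR"]].add((row["STATEFIP"], row["COUNTYFIP"]))
    return {year: len(counties) for year, counties in sorted(seen.items())}

# Made-up rows, not real ACS records.
sample_rows = [
    {"YEAR": 2011, "STATEFIP": 6, "COUNTYFIP": 37},
    {"YEAR": 2011, "STATEFIP": 6, "COUNTYFIP": 37},
    {"YEAR": 2011, "STATEFIP": 48, "COUNTYFIP": 37},
    {"YEAR": 2011, "STATEFIP": 6, "COUNTYFIP": 0},
    {"YEAR": 2012, "STATEFIP": 6, "COUNTYFIP": 37},
    {"YEAR": 2012, "STATEFIP": 48, "COUNTYFIP": 0},
]
print(counties_per_year(sample_rows))  # {2011: 2, 2012: 1}
```

Comparing the result with the downloadable table of identified counties should show which counties drop out or appear at each PUMA redrawing.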