About 97% of counties are metro in ACS microdata?

I downloaded the ACS microdata from IPUMS for multiple years. I may have processed the data incorrectly, but I found that about 97% of the counties are metropolitan counties (based on the “metro” variable and on linking the rural-urban definition from OMB). Does this mean that the ACS has been sampling individuals primarily from metro counties?

This is a great question, and the answer is a bit more nuanced than this. Detailed geographic information in the ACS public use microdata sample (PUMS) is suppressed by the Census Bureau for confidentiality reasons. The lowest level of geography released by the Census Bureau is the PUMA; PUMAs must contain at least 100,000 people and often contain several counties when the population of each county is small. You can see which counties in each state are identified in each year by looking at the file linked on the description page for COUNTYFIP. About half of persons in the ACS PUMS data live in counties that can be identified. The ability to identify a single county depends on PUMA boundaries aligning with county boundaries; this is more likely in a metropolitan area given the population requirements for PUMAs. Accordingly, I would interpret your finding (which I have not tried to replicate) this way: about 97% of respondents who live in identifiable counties are identifiably in metros.

OK. I think I got your point, but I still cannot explain this uncomfortable result (e.g., 97% of the residents come from metro counties) in plain language. Could you please rephrase your explanation briefly? Thanks.

I’ll try to explain it another way… It’s not accurate that “97% of the residents come from metro counties.” The ACS is a nationally representative sample and doesn’t oversample in metro areas. For example, if you look at the “case-count view” for the METRO variable, you’ll see that 184,397 households in the 2019 sample were in PUMAs that are completely non-metropolitan (code 1), which is 11.9% of all sampled units. Another 15.9% are in PUMAs that have an indeterminable (“mixed”) metropolitan/non-metropolitan status (code 0).
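If you want to reproduce that kind of case-count breakdown from your own extract, a minimal Python sketch might look like the following. The list of codes is made-up toy data, not real ACS records, and the function name metro_shares is just illustrative. The sketch assumes METRO code 0 means mixed, code 1 means not in a metro area, and codes 2 through 4 mean in a metro area.

```python
from collections import Counter

# Toy METRO codes, one per sampled household (made-up values, not real ACS data).
# Assumed coding: 0 = indeterminable (mixed), 1 = not in a metro area,
# 2-4 = in a metro area.
metro_codes = [0, 1, 2, 2, 3, 4, 2, 0, 3, 2]


def metro_shares(codes):
    """Return the unweighted percentage of records under each METRO code."""
    counts = Counter(codes)
    total = len(codes)
    return {code: 100 * n / total for code, n in sorted(counts.items())}


shares = metro_shares(metro_codes)
for code, pct in shares.items():
    print(f"METRO={code}: {pct:.1f}% of sampled households")
```

These are unweighted sample shares, which is what a case-count view reports; population estimates would need the household or person weights instead.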

It’s also not accurate that (as you said in your first post) “97% of the counties are metropolitan counties.” This wording suggests that you limited your computation to the counties that are identified by IPUMS USA (via the COUNTYFIP or COUNTYICP variables).

Maybe you measured this:

100 * (the number of identified counties that are metropolitan) / (the number of identified counties)

Or maybe you measured this:

100 * (the population in identified metropolitan counties) / (the population in identified counties)

Either calculation would omit a very large number of U.S. counties and the ACS respondents who live in them (about half of all respondents).
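To make the difference concrete, here is a rough Python sketch of both calculations, plus the share of respondents they leave out. The records are made-up toy data, the metro flag stands in for whatever OMB-based county classification you linked, and the sketch assumes that a COUNTYFIP value of 0 marks a county that is not identifiable.

```python
# Toy person records (made-up values, not real ACS data):
# (state FIPS, county FIPS, person weight, county is metro?)
# Assumption: COUNTYFIP == 0 means the county is not identifiable.
# The metro flag is a hypothetical column from an external OMB-based lookup.
records = [
    (27, 53, 150, True),
    (27, 53, 100, True),
    (27, 123, 100, True),
    (27, 163, 50, True),
    (27, 137, 100, False),
    (27, 0, 300, None),
    (27, 0, 200, None),
]

# Keep only persons whose county can be identified.
identified = [r for r in records if r[1] != 0]

# Measure 1: percentage of identified counties that are metropolitan.
counties = {(state, county): metro for state, county, _, metro in identified}
pct_counties_metro = 100 * sum(counties.values()) / len(counties)

# Measure 2: percentage of the weighted population in identified counties
# that lives in a metropolitan county.
pop_identified = sum(w for _, _, w, _ in identified)
pop_metro = sum(w for _, _, w, metro in identified if metro)
pct_pop_metro = 100 * pop_metro / pop_identified

# Coverage: percentage of the total weighted population in identified counties.
pop_total = sum(w for _, _, w, _ in records)
pct_identified = 100 * pop_identified / pop_total

print(pct_counties_metro, pct_pop_metro, pct_identified)
```

In this toy example both measures look heavily metropolitan, yet half of the weighted population never enters either denominator.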

As Kari explained, IPUMS USA can’t identify all counties because public use microdata do not provide adequate geographic detail to determine the county of residence for many respondents. Most non-metropolitan counties have populations that are too small for the ACS to provide information that would allow us to identify the specific counties. This doesn’t mean that there are no ACS respondents in those counties; it just means that for those respondents, IPUMS USA can’t identify the county.

So, in summary, PUMAs are equivalent to counties, but their boundaries are different. A PUMA could cover several counties, or a county could be split into several PUMAs. The METRO variable can be used to classify areas as mixed, not in metro, or in metro, but the unit for the METRO variable is a PUMA, not a county.

Yes, that’s about right. I wouldn’t say PUMAs are “equivalent” to counties, but they are similar in that both are geographic areas that subdivide the entire country into a few thousand units.

And you’re correct about how METRO codes are determined. The METRO description explains it this way:

In many public-use microdata samples, metropolitan and central/principal-city status are not directly identified. In such cases, IPUMS derives METRO codes based on other available geographic information, e.g., county groups (CNTYGP97 and CNTYGP98) or Public Use Microdata Areas (PUMA). If a county group or PUMA lies only partially within metropolitan areas or central/principal cities, then METRO indicates that the status is “indeterminable (mixed).”
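The rule in that description can be sketched in a few lines of Python. This is a simplified illustration, not the actual IPUMS procedure: the function name and labels are hypothetical, the input is assumed to be one boolean per piece of a PUMA, and the central/principal-city distinction is ignored.

```python
def derive_metro_status(parts_in_metro):
    """Classify one PUMA from booleans, one per component piece,
    each saying whether that piece lies inside a metropolitan area.

    Hypothetical helper for illustration; not the IPUMS implementation.
    """
    parts = list(parts_in_metro)
    if not parts:
        raise ValueError("a PUMA needs at least one component piece")
    if all(parts):
        return "in metro"
    if not any(parts):
        return "not in metro"
    # Some pieces are inside a metro area and some are outside.
    return "indeterminable (mixed)"


print(derive_metro_status([True, True]))    # PUMA entirely within metro areas
print(derive_metro_status([False, False]))  # PUMA entirely outside metro areas
print(derive_metro_status([True, False]))   # PUMA straddling a metro boundary
```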

NOTE: If you need ACS data summarized for all counties, including non-metropolitan counties, or if you’d like data that exactly distinguishes metro and non-metro population (with no “mixed” category), you could use ACS summary data, which IPUMS provides through our NHGIS site. Summary data is limited to the specific cross-tabulations that the Census Bureau defines, so it lacks the flexibility and robustness of microdata, but it provides information for much smaller areas than PUMAs.

Thanks for the help. I really appreciate it.