2022 ACS estimates using MET2013

I have noticed that the 2022 1-year ACS sample is not harmonized well with MET2013.

When I tabulate the number of households in each MET2013 metro area across the 2006 to 2022 ACS samples, some MSAs show a significant drop in the number of households in 2022.
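A minimal sketch of the kind of tabulation described above, assuming the extract is loaded into a pandas DataFrame with the IPUMS variables YEAR, MET2013, PERNUM, and HHWT. Households are counted by summing HHWT over one record per household (PERNUM == 1). The rows below are made up for illustration, and group quarters records may need to be excluded depending on the household definition you want.

```python
import pandas as pd

def households_by_msa(df: pd.DataFrame) -> pd.DataFrame:
    """Weighted household counts with MET2013 codes as rows and YEAR as columns."""
    heads = df[df["PERNUM"] == 1]  # keep one record per household
    return heads.pivot_table(index="MET2013", columns="YEAR",
                             values="HHWT", aggfunc="sum", fill_value=0)

# Made-up rows for illustration only, not real ACS data.
toy = pd.DataFrame({
    "YEAR":    [2021, 2021, 2021, 2022, 2022],
    "MET2013": [41740, 41740, 41740, 41740, 0],  # 0 = not in identifiable area
    "PERNUM":  [1, 2, 1, 1, 1],
    "HHWT":    [100, 100, 150, 120, 130],
})
table = households_by_msa(toy)
print(table)
```

Reading across a row of the resulting table shows how an MSA's estimated household count moves from year to year, which is where a sudden 2022 drop would appear.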

To figure it out, I looked at the number of PUMAs in San Diego MSA (41740) over time. There are several PUMAs in 2022 that are not included in the San Diego MSA, resulting in a significant reduction in households in 2022 San Diego.

The San Diego MSA consists of a single county, San Diego County. Is this an error in which some 2020 PUMAs are not correctly identified by the MET2013 variable?

I found there are many MSAs that have similar issues in their 2022 estimates using MET2013.

Do you have a plan for updating MET2013 for 2022 ACS?

Yes, we just recently determined that there are some problems with MET2013 in the 2022 1-year sample, and we hope to make corrections soon. We’ll try to reply here when that’s done.

In the meantime, you might consider using the crosswalk we’ve posted here, which describes associations between 2013 metro areas and 2020 PUMAs. Any ACS respondent who resides in a PUMA with a majority of its population in an identifiable metro area should be assigned a code for that metro area. We believe the crosswalk accurately describes these associations, but the MET2013 variable is currently missing a number of associations between PUMAs and metro areas.
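As a rough illustration of using the crosswalk as a stopgap, the sketch below merges a crosswalk table onto 2022 records by state and PUMA and flags records whose MET2013 code disagrees. The crosswalk column names (including MET2013_XWALK) and the toy rows are assumptions; the actual crosswalk file may be laid out differently.

```python
import pandas as pd

# Hypothetical crosswalk layout: one row per 2020 PUMA with a majority of its
# population in a 2013 metro area. Column names are assumptions.
xwalk = pd.DataFrame({
    "STATEFIP": [6, 6],
    "PUMA": [7301, 7323],
    "MET2013_XWALK": [41740, 41740],
})

# Toy 2022 records. PUMA 9999 is a placeholder that is absent from the crosswalk.
acs_2022 = pd.DataFrame({
    "STATEFIP": [6, 6, 6],
    "PUMA": [7301, 7323, 9999],
    "MET2013": [0, 41740, 0],  # 0 = not in identifiable area
})

def add_crosswalk_code(acs: pd.DataFrame, xwalk: pd.DataFrame) -> pd.DataFrame:
    """Attach the crosswalk metro code; PUMAs without a match get 0."""
    merged = acs.merge(xwalk, on=["STATEFIP", "PUMA"], how="left",
                       validate="many_to_one")
    merged["MET2013_XWALK"] = merged["MET2013_XWALK"].fillna(0).astype(int)
    return merged

merged = add_crosswalk_code(acs_2022, xwalk)
mismatch = merged[merged["MET2013"] != merged["MET2013_XWALK"]]
print(mismatch)
```

Because the crosswalk describes 2020 PUMAs, a merge like this should only be applied to samples that use that PUMA vintage.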

While we are currently investigating concerns regarding some MET2013 areas in the 2022 files (specifically those that are identified with a commission/omission error), the San Diego metro area is coded correctly using the 22 PUMAs that are all fully contained within the metro area. You can find the list of PUMAs that compose each metro area in the Crosswalk Between 2013 MSAs and 2020 PUMAs document on the resource page that Jonathan linked to. A tabulation of the data confirms that all of these PUMAs are correctly coded with MET2013 = 41740.

I suspect the issue you are encountering is that the codes for PUMAs change over different vintages; PUMAs are updated following each decennial census, creating a new vintage of PUMAs that may use codes from the previous vintage but associate them with a different PUMA. For example, PUMAs 7303, 7304, and 7305 (in STATEFIP = 6) were part of the San Diego metro area in 2012-2021; however, the 2022 vintage PUMA boundaries affect these codes. The area that PUMA 7303 encompassed (Vista City) from 2012-2021 is covered by PUMA 7323 from 2022 onward (you can see an overlay of the two PUMA vintages on this page).
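One way to avoid mixing vintages when comparing PUMAs over time is to carry the vintage as part of the key. This is only a sketch: the helper names are hypothetical, and it covers just the two periods mentioned above.

```python
def puma_vintage(year: int) -> int:
    """Return the PUMA vintage used by an ACS sample year (per the ranges above)."""
    if year >= 2022:
        return 2020
    if 2012 <= year <= 2021:
        return 2010
    raise ValueError(f"vintage for {year} is not covered by this sketch")

def puma_key(year: int, statefip: int, puma: int) -> tuple:
    """Key that distinguishes identical PUMA codes from different vintages."""
    return (puma_vintage(year), statefip, puma)

old = puma_key(2021, 6, 7303)
new = puma_key(2022, 6, 7303)
print(old, new, old == new)
```

With a key like this, code 7303 in 2021 and code 7303 in 2022 are never treated as the same area by accident.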

Thanks for your answer. After reviewing the dataset again, I suspect that some 2022 PUMAs are not correctly matched with MET2013. I just looked at the 2022 ACS sample for the San Diego MSA.
I kept only the observations with statefip==06 & countyfip==073, since the San Diego MSA consists of a single county, San Diego County.

That leaves 32,966 observations. Of these, 8,087 are coded as ‘not in identifiable area’ for MET2013. The other 24,879 are labeled as San Diego-Carlsbad.

The PUMA code of the observations coded as ‘not in identifiable area’ is 7301. This PUMA should belong to the San Diego-Carlsbad MSA (41740), and the crosswalk table confirms that.

Also, the observations with PUMA codes 7301, 7308, 7323, 7326, and 7329 are not identified as San Diego MSA in MET2013. As you mentioned, the 2022 vintage PUMAs, such as 7323 and 7326, appear to be involved in these errors. These codes should be identified as San Diego MSA in MET2013.
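A check along these lines can be sketched in pandas: restrict to the county, then list the PUMAs that contain any record without the expected MET2013 code. The function name and the toy rows are illustrative (PUMA 7399 is a placeholder, not a real code); the variable names STATEFIP, COUNTYFIP, PUMA, and MET2013 are the IPUMS ones.

```python
import pandas as pd

def pumas_not_coded(df: pd.DataFrame, statefip: int, countyfip: int,
                    expected_met: int) -> pd.DataFrame:
    """Per-PUMA record counts for PUMAs in a county with records lacking expected_met."""
    county = df[(df["STATEFIP"] == statefip) & (df["COUNTYFIP"] == countyfip)]
    counts = (county.assign(coded=county["MET2013"] == expected_met)
                    .groupby("PUMA")["coded"]
                    .agg(records="size", coded="sum"))
    counts["uncoded"] = counts["records"] - counts["coded"]
    return counts[counts["uncoded"] > 0]

# Made-up records for illustration only.
toy = pd.DataFrame({
    "STATEFIP":  [6, 6, 6, 6, 6],
    "COUNTYFIP": [73, 73, 73, 73, 37],
    "PUMA":      [7301, 7301, 7308, 7399, 3701],
    "MET2013":   [0, 0, 0, 41740, 31080],
})
result = pumas_not_coded(toy, statefip=6, countyfip=73, expected_met=41740)
print(result)
```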

Because these PUMAs are missing from MET2013, the number of households in the San Diego MSA (41740) dropped from 1,162,898 in 2021 to 908,104 in 2022. These estimates are unreliable. However, when I collapse the data by countyfip (073), the number of households increased from 2021 to 2022, which makes sense.

Could you please check this issue? I believe several MSAs in the MET2013 variable have the same issue.

The problem you’re describing does not currently exist in the IPUMS USA online data analysis tool. When I cross MET2013 with COUNTYFIP for all 2022 records in California, there is a 100% match between the San Diego–Carlsbad metro area and San Diego County (countyfip==073).
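For anyone who wants to reproduce this kind of cross-tabulation on their own extract rather than in the online tool, here is a hedged pandas sketch. It assumes person weights in PERWT and uses made-up rows in which some San Diego County records are deliberately left uncoded, so the share comes out below 100%.

```python
import pandas as pd

# Made-up records for illustration only.
toy = pd.DataFrame({
    "COUNTYFIP": [73, 73, 73, 37],
    "MET2013":   [41740, 41740, 0, 31080],  # 0 = not in identifiable area
    "PERWT":     [100, 50, 50, 200],
})

# Weighted cross-tabulation of county by metro code.
xtab = pd.crosstab(toy["COUNTYFIP"], toy["MET2013"],
                   values=toy["PERWT"], aggfunc="sum").fillna(0)

# Share of San Diego County's weighted population coded as San Diego-Carlsbad.
share = xtab.loc[73, 41740] / xtab.loc[73].sum()
print(xtab)
print(f"share coded 41740: {share:.2%}")
```

A share of 100% for countyfip 073 would correspond to the full match described above.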

However, while investigating the other problems with MET2013 (which I mentioned in my previous message), I discovered that an earlier version of our MET2013 definition had even more missing associations than the current version, including all of the cases that you’ve identified (California PUMAs 7301, 7308, etc.).

This suggests that you may have acquired your data earlier, before an update that corrected the associations for the San Diego–Carlsbad metro area. If you obtained your data recently (within the last 3 months), please let us know, and we’ll investigate further. Otherwise, please try to resubmit your data extract request to get a fresh version. I expect there are still some issues with other metro areas, but (hopefully) not for San Diego.

Thanks for the quick response!

My data was downloaded on February 9, 2024.
The San Diego issue was solved by resubmitting my data extract request; I downloaded the fresh version on July 29, 2024. This newer version also resolves the same issue for the New Orleans and Boston MSAs.

However, some other MSAs, such as Birmingham, AL and Hartford, CT, still have the issue. By comparing the 2021 and 2022 1-year ACS data files, I compiled a list of MSAs that had an unexpected reduction in their households between 2021 and 2022. I am unsure whether this reflects actual shrinking of the MSAs or data errors.
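A list like this can be produced with a simple year-over-year screen. The sketch below uses the San Diego household counts quoted earlier in this thread plus one made-up MSA (code 99999); the 10% threshold is an arbitrary assumption, not an official cutoff.

```python
import pandas as pd

def flag_drops(counts: pd.DataFrame, prev_year: int, year: int,
               threshold: float = -0.10) -> pd.DataFrame:
    """Return MSAs whose household count fell by more than the threshold."""
    change = counts[year] / counts[prev_year] - 1.0
    out = pd.DataFrame({"prev": counts[prev_year],
                        "curr": counts[year],
                        "pct_change": change})
    return out[out["pct_change"] < threshold]

counts = pd.DataFrame(
    {2021: [1162898, 500000], 2022: [908104, 505000]},
    index=pd.Index([41740, 99999], name="MET2013"),  # 99999 is a made-up MSA
)
flagged = flag_drops(counts, 2021, 2022)
print(flagged)
```

A screen like this only identifies candidates; it cannot tell a real population change apart from a coding problem.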

I hope a later version will resolve the missing associations between MET2013 and 2020 PUMAs. I would be grateful if you could keep me posted when a newer version is released.

I have several notes to report:

  • We’ve now corrected all known issues with our MET2013 codes in our 2022 ACS samples, and they should now match with actual 2013 metro areas as closely as possible.
  • We discovered that similar issues existed for 4 other geographic identifier variables (CITY, MET2023, MIGMET131, PWMET13), and we’ve published corrections for all of these.
  • More information about this revision and the areas that it affects can be found in the September 4, 2024, entry of the Revision History.
  • We determined that we’d made widespread corrections to MET2013 on March 7, 2024, but we hadn’t previously published revision notes on these corrections. We have now, so you can find information about these corrections in the March 7 entry of the Revision History.
    • This includes the corrections to the MET2013 identification of San Diego that we discussed earlier in this post.
  • Regarding Birmingham & Hartford specifically:
    • Our September 4 corrections associate one more PUMA with the Hartford metro area, reducing the percentage of Hartford residents omitted in our MET2013 coding from 20.5 to 7.0%. The new association also adds a small “commission error” (records with a Hartford code that did not actually reside in Hartford), totalling 0.9% of the population with the Hartford code.
    • Our September 4 corrections made no changes in our coding for Birmingham, AL, although our March 7 corrections did improve the Birmingham identification significantly (reducing the omitted population from 27.6 to 1.9%).
    • Importantly, neither the Birmingham nor Hartford metro areas can be identified perfectly given the geographic information provided in public use ACS microdata, and there may be some substantial changes in the populations that MET2013 associates with these metro areas that are due to the change in geographic units identified in ACS microdata from 2010 PUMAs to 2020 PUMAs, which occurs in the 2022 ACS sample.
      • As explained in the MET2013 variable description, many PUMAs straddle metro area boundaries, so IPUMS USA constructs MET2013 by identifying which PUMAs have a majority of their population in an identifiable city or metro area. This results in two types of mismatch between MET2013 codes and the corresponding metro areas–errors of omission and errors of commission.
      • I recommend you inspect our “match summaries” (available here) for both 2020 PUMAs and 2010 PUMAs to see how well the associated PUMAs represent metro areas in each sample.
      • The sum of mismatch errors for Birmingham changed from 10.8% to 1.9% between the 2021 and 2022 samples, and Hartford’s changed from 0% to 7.84%. This could explain a large portion of the changes in population that you’ve measured.
    • If you’d like a more exact measure of the populations of metro areas, I’d recommend using ACS summary data, rather than microdata. The Census Bureau publishes ACS summary tables for many geographic levels, including metro areas (i.e., “core based statistical areas” or CBSAs).
      • You can access ACS summary tables through IPUMS NHGIS.
      • Note that the definitions of CBSAs change occasionally, and the ACS summaries always report totals for a current definition of CBSAs. So if you want data for consistent, unchanging areas, you may need to obtain county-level summary data and use CBSA delineation files to associate the county data with a single year’s CBSAs.
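To make the omission and commission errors described above concrete, here is a small sketch of how they could be computed under the majority-population rule. The column names and population figures are made up, and the exact calculation behind the published match summaries may differ.

```python
import pandas as pd

def mismatch_errors(pumas: pd.DataFrame) -> tuple:
    """Return (omission, commission) rates for one metro area.

    A PUMA receives the metro code when a majority of its population lives
    in the metro area. Omission is the share of the metro population living
    in PUMAs without the code; commission is the share of the coded
    population living outside the metro area.
    """
    share = pumas["pop_in_metro"] / pumas["pop_total"]
    assigned = share > 0.5
    metro_pop = pumas["pop_in_metro"].sum()
    omitted = pumas.loc[~assigned, "pop_in_metro"].sum()
    coded_pop = pumas.loc[assigned, "pop_total"].sum()
    committed = (pumas.loc[assigned, "pop_total"]
                 - pumas.loc[assigned, "pop_in_metro"]).sum()
    return omitted / metro_pop, committed / coded_pop

# Hypothetical PUMAs that intersect one metro area (made-up populations).
pumas = pd.DataFrame({
    "pop_total":    [1000, 500, 1000],
    "pop_in_metro": [1000, 400, 200],
})
omission, commission = mismatch_errors(pumas)
print(f"omission: {omission:.1%}, commission: {commission:.1%}")
```

In this toy case the third PUMA falls below the majority threshold, so its 200 metro residents are omitted, while the second PUMA contributes 100 residents from outside the metro area to the coded population.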