I’m trying to do a simple count of households by geography using the sum of hhwt values and filtering out duplicates of serial numbers, using the 2021 1YR ACS. When I do this for the state of Minnesota I get a total of 2,281,037, very close to the ACS table S1101 total of 2,281,033. However, when I follow the same approach for a metro area I get different results. I used IPUMS MET2013 #11700 for Asheville NC and came up with a total household count of 221,068, while the Census household total for the MSA is 185,423. Might Census and IPUMS be using different geographies for the Asheville MSA?
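For reference, the counting approach described above can be sketched roughly as follows. This is a minimal illustration, not an official IPUMS recipe: it assumes person-level records carrying the IPUMS variables SERIAL, HHWT, and MET2013, that all records come from a single sample (SERIAL is only unique within a sample), and the helper name `household_totals` and the toy values are made up for the example.

```python
from collections import defaultdict


def household_totals(rows, geo_key):
    """Sum HHWT once per household (unique SERIAL) within each geography.

    `rows` is any iterable of person-level records (dicts), for example
    rows read from a CSV extract. Assumes a single sample, so SERIAL
    alone identifies a household.
    """
    seen = set()
    totals = defaultdict(float)
    for row in rows:
        serial = row["SERIAL"]
        if serial in seen:
            continue  # later person records repeat the household's HHWT
        seen.add(serial)
        totals[row[geo_key]] += float(row["HHWT"])
    return dict(totals)


# Toy person-level records (made-up values, not real ACS data).
rows = [
    {"SERIAL": 1, "HHWT": 100, "MET2013": 11700},
    {"SERIAL": 1, "HHWT": 100, "MET2013": 11700},  # second person, same household
    {"SERIAL": 2, "HHWT": 50, "MET2013": 11700},
    {"SERIAL": 3, "HHWT": 80, "MET2013": 0},
]

totals = household_totals(rows, "MET2013")
print(totals)
```

The same function can be grouped on a state variable instead by passing a different `geo_key`.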
MET2013 uses 2013 metro area definitions, while the 2021 ACS summary data use 2020 metro definitions. In addition, as explained in the MET2013 variable description, MET2013 is not always exact: the source PUMS files do not identify any sub-state geographies other than PUMAs, so MET2013 identifies only the metro area that respondents are likely to live in based on the available PUMA information. MET2013 allows for a 15% population mismatch tolerance between PUMAs and a 2013 metro area. For samples that use 2010 PUMAs, as the 2021 ACS sample does, that tolerance is assessed with 2010 population totals. This means a population summary based on MET2013 could be off by more than 15% from the ACS summary data if there has been significant population change in the area since 2010 (as is the case for Asheville in 2021, where the count you computed is about 19% above the published total). The MET2013 variable description also provides links to more information about the degree and type of mismatch for each metro area.
Got it. If I want household estimates for metro areas, is MET2013 still the best variable to use (if I want to avoid the metro by metro summary of PUMAs)? Relatedly, is a MET2023 in the works at IPUMS, and if so, when might it be released?
To get metro-area estimates from current ACS microdata, MET2013 is the best available option in IPUMS USA.
By “metro by metro summary of PUMAs,” do you mean deriving metro-area estimates from PUMA information? If so, that’s basically how MET2013 already works. Going back to the PUMAs yourself would let you do two things MET2013 does not. First, you could generate estimates for metro areas that MET2013 doesn’t identify because their best-matching PUMAs exceed the 15% population mismatch threshold. Second, you could weight residents of split PUMAs by the proportion of the PUMA population that lies in the metro area, which would bring you closer to the totals in the ACS summary tables.
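As a rough sketch of the split-PUMA weighting idea, the snippet below scales each household's HHWT by the share of its PUMA's population that falls inside the metro area. Everything beyond the IPUMS variable names STATEFIP, PUMA, and HHWT is an assumption for illustration: the `puma_share` lookup stands in for a PUMA-to-metro crosswalk you would need to build or obtain yourself, the function name is hypothetical, and the PUMA codes and shares are made up.

```python
def metro_households(households, puma_share):
    """Estimate a metro area's household count from PUMA-level records.

    `households` holds one record per household (already deduplicated).
    `puma_share` maps (STATEFIP, PUMA) to the fraction of that PUMA's
    population inside the metro area; PUMAs not listed count as 0.
    """
    total = 0.0
    for hh in households:
        share = puma_share.get((hh["STATEFIP"], hh["PUMA"]), 0.0)
        total += hh["HHWT"] * share
    return total


# Hypothetical crosswalk: one PUMA wholly inside the metro, one split PUMA.
puma_share = {
    (37, 2201): 1.0,
    (37, 2202): 0.25,
}

# Toy household-level records (made-up values, not real ACS data).
households = [
    {"STATEFIP": 37, "PUMA": 2201, "HHWT": 100},
    {"STATEFIP": 37, "PUMA": 2202, "HHWT": 200},
    {"STATEFIP": 37, "PUMA": 2300, "HHWT": 50},  # outside the metro
]

estimate = metro_households(households, puma_share)
print(estimate)
```

Scaling weights this way treats households in a split PUMA as evenly spread relative to population, which is only an approximation.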
Alternatively, IPUMS NHGIS provides ACS summary tables for metro areas (along with micro areas as “core-based statistical areas”).
Yes, we plan to add a MET2023 in the future. I’d speculate we’ll get there in a year or two. Our first priority will be extending MET2013 into the 2022 ACS samples, which will require extra effort because the 2022 samples will be the first to use new 2020 PUMAs.
Thank you for clarifying that 2022 will be the first to use the 2020 PUMAs!
Thanks for this response. Because I’m just looking at larger metro areas I think the MET2013 variable will meet my needs. Appreciate IPUMS doing the work to get these metro aggregations.