PUMA codes for Chicago MSA

Hi,

I am working on a project that needs a tract-level map of the Chicago-Naperville-Elgin, IL-IN-WI Metropolitan Statistical Area (MSA). To define the MSA boundaries, I use the MET2013, STATE and PUMA variables in the 2019-2023 IPUMS 5-year dataset, together with the 2020_Census_Tract_to_2020_PUMA crosswalk file from the Census Bureau website.

However, when I subset the Chicago MSA map based on the tract-PUMA crosswalk results, I found an error: the map includes Ogle County (17141), which does not belong to the Chicago MSA. It also misses three other counties: Grundy (17063), Jasper (18073) and Newton (18111).

I compared the IPUMS data and the crosswalk file and found that the discrepancy comes from a single PUMA: state code 17, 5-digit PUMA code 03700. In the PUMA file, this PUMA is classified under County FIPS 17000 (unidentifiable), while in the crosswalk file it is under County FIPS 17141, which is assigned to Ogle County on the map.

So my questions are: does the crosswalk file classify PUMA 173700 under Ogle County by mistake, or does this reflect the latest reclassification of the Chicago MSA boundaries? If it is a mistake, which county should I assign this PUMA to in order to correct the boundaries and figures on the tract-level map? And what happened to the other three missing counties, which are supposed to be included in the Chicago MSA but do not appear under its PUMA codes?

Also, what is the best way to extract a tract-level MSA shapefile in R? I use the IPUMS + crosswalk files for other MSAs, where the approach appears to work well, but the Chicago map has made me doubt the validity of this strategy. I am attaching a screenshot of the Chicago MSA map that I created earlier for reference; it includes Ogle County because of the tract-PUMA classification.

Any insights would be greatly appreciated! Thanks!

The variables MET2023 and MET2013 correspond only approximately to the official delineations. As we note in the descriptions on those variables' webpages, since 1990 the only sub-state geographic information available in public-use census and ACS/PRCS microdata is for PUMAs, areas which occasionally straddle official metro area boundaries. Given this limitation, MET2023 (and MET2013) cannot identify the exact set of households residing in every metro area. According to our 2023 MSA and 2020 PUMA crosswalk, which can be found on our 2023 Metropolitan Areas: Delineations and PUMA Correspondence page, 2020 PUMA 03700 in Illinois is composed of two counties, Ogle and DeKalb. Because 65.98% of the PUMA's population resides inside the Chicago-Naperville-Elgin, IL-IN MSA, we assign the whole PUMA a Chicago metro code. That is why a map built from PUMA-based metro codes pulls in Ogle County.
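To make the assignment logic concrete, here is a minimal sketch in Python of a population-share rule of this kind. It is an illustration only, not the code used to build MET2023 or MET2013: the function names and the simple-majority threshold of 0.5 are assumptions. The 65.98% share comes from the crosswalk figure quoted above, and attributing that share to DeKalb County (with the remainder in Ogle) is an inference from the fact that Ogle is outside the MSA.

```python
# Sketch of a population-share rule for giving a whole PUMA a metro code.
# Hypothetical helper names; the 0.5 threshold is an assumption.

def metro_share(county_shares, metro_counties):
    """Fraction of a PUMA's population living in counties inside the metro area."""
    return sum(share for county, share in county_shares.items()
               if county in metro_counties)

def assign_metro(county_shares, metro_counties, threshold=0.5):
    """True if the PUMA's in-metro population share exceeds the threshold."""
    return metro_share(county_shares, metro_counties) > threshold

# PUMA 03700 in Illinois spans two counties. The 65.98% in-metro share is
# quoted in the thread; placing it in DeKalb is an inference.
puma_03700 = {"DeKalb": 0.6598, "Ogle": 0.3402}

# Only the metro county relevant to this PUMA is listed here.
chicago_counties = {"DeKalb"}

gets_chicago_code = assign_metro(puma_03700, chicago_counties)
```

Under a rule like this the whole PUMA, including its Ogle County tracts, carries the Chicago code, so PUMA-based metro codes cannot reproduce county-exact MSA boundaries.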

Our geographer colleague saw your post and shared some ideas for extracting a tract-level MSA shapefile. The simplest way would be to get a correspondence file between tracts and CBSAs (core-based statistical areas) from Geocorr, join that to a tract shapefile, and select the tracts in the shapefile based on the CBSA/metro correspondence. Similarly, you could get a CBSA delineation file from the Census website and join it to a tract shapefile using state and county codes. Note, if you are working with Connecticut data, be aware of the changes in county coding between 2021 and 2022, which caused all the tract IDs to change.
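The county-code join can be sketched as follows. This is a minimal Python illustration rather than a complete workflow: it relies on the fact that the first five digits of an 11-digit tract GEOID are the state and county FIPS codes, the county set holds only the three codes mentioned in this thread (a real delineation file lists every county in the MSA), and the tract GEOIDs and function names are hypothetical.

```python
# Sketch: select tracts whose county belongs to an MSA county list.
# County codes are the ones named in this thread; tract GEOIDs are hypothetical.

def county_fips(tract_geoid):
    """First 5 digits of an 11-digit tract GEOID: 2-digit state + 3-digit county."""
    return tract_geoid[:5]

def tracts_in_msa(tract_geoids, msa_counties):
    """Keep only the tracts whose county FIPS code is in the MSA county set."""
    return [g for g in tract_geoids if county_fips(g) in msa_counties]

# Partial county list for illustration: Grundy, Jasper, Newton.
msa_counties = {"17063", "18073", "18111"}

# Hypothetical tract GEOIDs; the second one falls in Ogle County (17141).
tracts = ["17063000100", "17141961300", "18073100800", "18111100400"]

selected = tracts_in_msa(tracts, msa_counties)
```

The same prefix-and-filter step carries over to R or any GIS tool: derive the five-digit county code from the tract GEOID column of the shapefile's attribute table and keep the rows whose code appears in the delineation file.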

Thanks very much for the information and ideas! I eventually used an MSA county file to subset the tract-level MSA map by state and county codes. I will try the Geocorr package too!