Matching MSAs using the NHGIS data

I want to download a .shp file that identifies the metropolitan statistical area boundaries matching the “metarea” variable from the IPUMS USA Census and ACS data, which uses the 1999 OMB delineations. I want to use the geographic boundaries to map certain values (such as the share of immigrants and income levels) at the MSA level. I just downloaded the 2000 TIGER shapefiles for the MSA/CMSAs through the NHGIS website, but I’m not sure which variable is the “ID” variable. I read part of the online manual, which seemed to indicate that the variable “extent” is the ID variable, and that seemed to work when I ran the shp2dta command in Stata. I see values for the MSAs, but I don’t see a variable that tells me which MSA each one is. Is there a file that shows which MSA corresponds to each MSACMSA code produced from the .shp file? Or better yet, is there an easier way to get MSA boundaries that match the metarea variable from the IPUMS USA data?

IPUMS doesn’t currently offer many resources for this specific goal beyond those you’ve already identified, but a few points may help.

First, note that the NHGIS shapefile that matches up best with the IPUMS USA METAREA variable for 2000 samples is the Primary Metropolitan Statistical Area shapefile (i.e., CMSA–PMSA, not MSA/CMSA). PMSAs nest within Consolidated Metropolitan Statistical Areas, and the detailed METAREA codes generally identify PMSAs, not CMSAs.

Second, at this time, we have no crosswalk between the METAREA codes in IPUMS USA (which are unique to IPUMS) and the official MSA/PMSA codes used in the NHGIS shapefile. As explained in the METAREA description, the METAREA coding system is based on the official 1990 system with some adjustments. I think the best way to match for now is to compare the metro area names from each source and match on those names.
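Matching on names, as suggested above, could be sketched roughly as follows. This is only an illustration in Python using the standard library, not an IPUMS or NHGIS tool. The codes and labels in the example are made-up placeholders (not real METAREA or PMSA codes), and the function names and the 0.8 similarity cutoff are assumptions you would need to tune. Any fuzzy matches should still be reviewed by hand.

```python
import difflib
import re


def normalize(name):
    """Lowercase a metro name, drop MSA/PMSA/CMSA tags, and strip punctuation."""
    name = name.lower()
    name = re.sub(r"\b(pmsa|cmsa|msa)\b", " ", name)
    name = re.sub(r"[^a-z]+", " ", name)
    return " ".join(name.split())


def match_names(ipums_labels, nhgis_names, cutoff=0.8):
    """Map each IPUMS code to the NHGIS code with the closest name, or None."""
    norm_to_code = {normalize(n): code for code, n in nhgis_names.items()}
    matches = {}
    for code, label in ipums_labels.items():
        hit = difflib.get_close_matches(
            normalize(label), list(norm_to_code), n=1, cutoff=cutoff
        )
        matches[code] = norm_to_code[hit[0]] if hit else None
    return matches


# Placeholder codes and labels, for illustration only.
ipums_labels = {
    "ipums_001": "Akron, OH",
    "ipums_002": "Boston, MA-NH",
    "ipums_003": "Yuma, AZ",
}
nhgis_names = {
    "nhgis_A": "Akron, OH PMSA",
    "nhgis_B": "Boston, MA--NH PMSA",
}

matches = match_names(ipums_labels, nhgis_names)
```

Entries that come back as None have no sufficiently similar name in the other source and need to be resolved manually.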

Lastly, you should know that for many metro areas, the METAREA codes omit large portions of the metro area population. See the “User Caution” note in the METAREA description and the summary of Incompletely Identified Metro Areas.

Given these issues, the IPUMS USA MET2013 variable might suit your aims better. It uses 2013 MSA definitions, which may be a problem for some analyses of 2000 data, but it is available for 2000 microdata samples. It also uses official MSA codes, so it is easy to match to NHGIS 2013 shapefiles, and it better represents actual MSA populations (with a maximum mismatch tolerance of 15%).
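Because MET2013 uses official codes, the join to boundary data can be done directly on the code. Here is a minimal sketch in Python (standard library only) that computes a weighted immigrant share per metro area and attaches it to boundary records. Several things here are assumptions: the rows are invented, the five-digit codes are placeholders, `foreign_born` is a hypothetical 0/1 flag you would derive yourself, the weight is assumed to be PERWT, and the code assumes that a MET2013 value of 0 marks records not in an identifiable metro area. Check the MET2013 documentation and your shapefile's attribute table for the actual code field and its formatting.

```python
from collections import defaultdict


def weighted_share_by_metro(rows):
    """Return {zero-padded metro code: weighted share of flagged persons}.

    Each row is (met2013, perwt, foreign_born), where foreign_born is 0 or 1.
    """
    totals = defaultdict(float)
    flagged = defaultdict(float)
    for met2013, perwt, foreign_born in rows:
        if met2013 == 0:  # assumed: 0 = not in an identifiable metro area
            continue
        key = f"{met2013:05d}"  # shapefile codes are often stored as strings
        totals[key] += perwt
        flagged[key] += perwt * foreign_born
    return {k: flagged[k] / totals[k] for k in totals}


# Invented microdata rows: (MET2013, PERWT, foreign_born).
rows = [
    (11111, 100, 1),
    (11111, 300, 0),
    (22222, 50, 1),
    (22222, 50, 1),
    (0, 80, 0),
]
shares = weighted_share_by_metro(rows)

# Invented boundary attributes keyed by the shapefile's metro code.
boundaries = {"11111": "Placeholder Metro A", "33333": "Placeholder Metro C"}

# Keep only metro areas present in both sources.
joined = {
    code: (name, shares[code])
    for code, name in boundaries.items()
    if code in shares
}
```

Codes that appear in only one source (here "22222" and "33333") drop out of the join, so it is worth listing them to see which metro areas go unmapped.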