I’m producing county-level choropleths for each year of the US full-count censuses, 1850-1940. I have read that I can use the IPUMS USA variable COUNTYNHG to combine census-level tabulations (e.g. employment share by industry) with the NHGIS county boundary files. However, county boundaries change over time, and I have not been able to find which year (i.e. vintage) of county boundaries the COUNTYNHG variable refers to, assuming it refers to a single year and does not itself change from year to year (I was hoping to keep the county boundaries consistent across years). Could you clarify this?
Thank you in advance for the help, and apologies if I missed something in the documentation.
The comparability tab for COUNTYNHG specifies that these codes are year-specific, so the meaning of a given code varies by census year. You can get the year-specific NHGIS boundary files via the NHGIS website (using the data finder, select the years of interest in the filters area and click on the “GIS Files” tab of the results). IPUMS does not offer harmonized county boundaries, but you might be interested in my colleague Jonathan Schroeder’s approach for county crosswalks or these crosswalks from Ferrara et al. via Open ICPSR.
Thank you very much for the prompt and helpful response, and my apologies for missing that information earlier in the comparability tab for COUNTYNHG.
Thank you for your earlier help. I have a quick follow-up: I’m now also using the 1940 full-count census data and have realized that it does not include the COUNTYNHG variable needed to match these counties to the crosswalk from Ferrara et al. via Open ICPSR. Is there another way to obtain the COUNTYNHG variable for 1940? As a test, I tried replicating the COUNTYNHG variable in the 1860 full-count census data by constructing another version from the STATEFIP and COUNTYICP codes, but this matched the provided COUNTYNHG code in only about half of counties. Another alternative might be the PLACENHG variable, which is available (only) for 1940, but it does not seem to include any county information (though perhaps that can be obtained by matching it to the NHGISPLACE codes in the NHGIS place point shapefiles?).
This also seems to be an issue with the 1900 full-count census data, which does not include the COUNTYNHG variable, though perhaps for that year I can construct a crosswalk from the STATEICP/COUNTYICP codes to COUNTYNHG codes using the 1900 census 1% sample, which includes both.
Thank you again for the assistance!
You’re likely to get a much better match by concatenating STATEFIP and COUNTYICP for the 1940 full count than you did for 1860, because of how IPUMS NHGIS codes territories that had not yet entered the union. For example, in samples from when Minnesota was a territory, NHGIS uses the state code 275 rather than the FIPS-based 270. Differences of this kind would have affected your ability to match codes in 1860.
COUNTYNHG codes are 7-character county identifiers. The first 3 characters give the NHGIS state code and the last 4 characters give the NHGIS county code (both being FIPS + “0”). Moreover, since ICP county codes are also almost always (except for the case of Maryland) given in the format FIPS code + “0”, the combination of STATEFIP, “0”, and COUNTYICP, both of which are available in the 1940 full-count, should give you the correct COUNTYNHG code.
There are a couple of things to be aware of, however. First, since FIPS codes were instituted around the time of the 1970 census, historical counties that were dissolved before then have no FIPS code. For such counties, ICPSR generally appends a fourth digit of 5 rather than 0. Second, these codes should include leading zeros. For example, the NHGIS state code for Connecticut is 090 (FIPS code “09” + “0”) and the ICP county code for Fairfield County is 0010 (FIPS code “001” + “0”). The COUNTYNHG code for Fairfield County, CT is therefore 0900010.
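The construction described above can be sketched in Python. This is only an illustration, assuming STATEFIP and COUNTYICP arrive as integers (as they typically do in a numeric extract); the function name `build_countynhg` is hypothetical and not part of any IPUMS tool.

```python
def build_countynhg(statefip: int, countyicp: int) -> str:
    """Build a 7-character county code: STATEFIP (2 digits) + "0" + COUNTYICP (4 digits).

    Both parts are zero-padded, so leading zeros lost in a numeric extract are restored.
    """
    return f"{int(statefip):02d}0{int(countyicp):04d}"


# Fairfield County, CT: STATEFIP 9, COUNTYICP 10 (the example from the post above)
print(build_countynhg(9, 10))  # 0900010
```

Because COUNTYICP already carries its own fourth digit, a county whose ICP code ends in 5 (a county dissolved before FIPS codes existed) passes through unchanged. Maryland and territories are the cases where the result may not equal the true COUNTYNHG code.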
Alternatively, IPUMS NHGIS 1940 shapefiles include the variables ICPSRST (state ICPSR codes) and ICPSRCTY (county ICPSR codes). You can build the crosswalk using these variables together with STATEFIP and COUNTYICP from IPUMS USA. Either method is likely to yield only a few mismatches.
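Whichever method you use, it may be worth counting the mismatches before mapping. Below is a minimal sketch, assuming you have already collected the constructed codes and the county codes from the boundary file into two lists of strings. The function name `match_report` and the sample codes are hypothetical, chosen only to show the output.

```python
def match_report(constructed, reference):
    """Compare constructed county codes against the codes in a reference file."""
    constructed, reference = set(constructed), set(reference)
    matched = constructed & reference
    return {
        "matched": sorted(matched),
        # Codes built from the census data that have no partner in the reference file
        "only_constructed": sorted(constructed - reference),
        # Reference counties that no constructed code points to
        "only_reference": sorted(reference - constructed),
        "match_rate": len(matched) / len(constructed) if constructed else 0.0,
    }


# Hypothetical codes, for illustration only
report = match_report(
    ["0900010", "0900030", "2700015"],
    ["0900010", "0900030", "0900050"],
)
print(report["only_constructed"])  # ['2700015']
```

The two "only" lists identify the specific counties that need a manual fix, which is more useful than the match rate alone.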