Crosswalk 1920 counties to 1990 counties

I am trying to crosswalk the 1920 counties to 1990 counties based on the population-weighted crosswalk developed in this paper: New Area- and Population-based Geographic Crosswalks for U.S. Counties and Congressional Districts, 1790-2020.

I am confused about the actual number of counties in 1920. Based on countynhg, the Census data contain 3063 distinct counties, whereas my crosswalk file has 3075 (also based on countynhg). The other available crosswalks that I checked also report 3075 distinct county codes.
Because this mismatch produces missing values when I join the Census data to the crosswalk, I would like to understand where it comes from.

I appreciate the support!

Dear Lennart,

From IPUMS USA, I created a data extract (on Dec 16, 2022) containing all records from the complete-count 1920 census. The extract included the countynhg variable. I then contracted the microdata file on countynhg to determine how many unique counties are represented in the complete-count 1920 file.
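
For anyone who wants to reproduce this kind of count outside a statistical package, the contraction step amounts to counting records per distinct countynhg value. A minimal Python sketch, assuming the extract has been exported to CSV with a countynhg column; the rows and codes below are hypothetical, and this is not necessarily how the count above was produced:

```python
import csv
import io
from collections import Counter


def contract(rows, key="countynhg"):
    """Count records per distinct value of `key` in a single pass."""
    counts = Counter()
    for row in rows:
        counts[row[key]] += 1
    return counts


# Hypothetical stand-in for a CSV export of the extract (codes are made up).
sample = io.StringIO("countynhg,age\nA001,34\nA001,7\nA003,51\n")
counts = contract(csv.DictReader(sample))

print(len(counts))  # number of distinct counties in the file
```

Because the rows are read one at a time, memory use grows with the number of counties rather than the number of person records, which matters for a complete-count file.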

My contracted dataset contains 3074 counties, which is very close to the 3075 counties in the crosswalk you linked to. There is one county in the crosswalk (Kalawao County, Hawaii) that isn’t in my contracted dataset.

Can you provide a list or some examples of the 12 counties that are in the crosswalk file that aren’t in the census data that you’re analyzing? That should help us in our investigation.
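
If it helps, such a list can be produced with a simple set difference on the county codes. A minimal Python sketch, assuming the countynhg codes from each file have already been read into lists; the code values below are made up for illustration and are not real NHGIS codes:

```python
def compare_codes(census_codes, crosswalk_codes):
    """Return the codes that appear in only one of the two sources."""
    census = set(census_codes)
    crosswalk = set(crosswalk_codes)
    return {
        "only_in_crosswalk": sorted(crosswalk - census),
        "only_in_census": sorted(census - crosswalk),
    }


# Hypothetical codes, for illustration only.
census_codes = ["A001", "A003", "A003", "A005"]
crosswalk_codes = ["A001", "A003", "A005", "A007"]

diff = compare_codes(census_codes, crosswalk_codes)
print(diff["only_in_crosswalk"])
print(diff["only_in_census"])
```

Checking both directions is worthwhile, since a code that appears only in the census data would point to a different problem than one that appears only in the crosswalk.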

Dave Van Riper

Dear Dave,

I used the 1920 1% Census sample.
I had the same idea of using the 1920 full-count Census to check the problem, and this allowed me to identify which counties are missing from the 1% sample.

As the full-count dataset is, unfortunately, too large to work with (at least for me), I simply transferred the persons living in the missing counties to the smaller dataset.
Missing_counties_1920.dta (53.4 KB)
Attached is a file listing the counties that were not part of the 1% sample.
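
For readers facing the same size problem, the transfer described above can be done without loading the full-count file into memory. A minimal Python sketch, assuming the full-count extract is available as CSV with a countynhg column; the function name, rows, and codes are hypothetical:

```python
import csv
import io


def filter_rows(reader, wanted_codes, key="countynhg"):
    """Yield only the rows whose county code is in wanted_codes, one at a time."""
    wanted = set(wanted_codes)
    for row in reader:
        if row[key] in wanted:
            yield row


# Hypothetical stand-in for the full-count CSV (codes are made up).
full_count = io.StringIO("countynhg,age\nA001,34\nA007,7\nA003,51\nA007,62\n")
missing = ["A007"]  # codes absent from the 1% sample

kept = list(filter_rows(csv.DictReader(full_count), missing))
print(len(kept))  # persons living in the missing counties
```

Note that rows appended this way come from a 100% file while the rest of the dataset is a 1% sample, so any person weights would need to reflect that difference.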


The counties that are omitted are among the smallest by population in the US in 1920. I compared these counties to the 1920 county population tabulations from IPUMS NHGIS, and they rank 1-5, 7, 9, 11, 15-16, and 27 (when listing counties from smallest population to largest); the majority have fewer than 500 people. Accordingly, I am not surprised that some are not included in the 1% sample file. However, I have a request out to some colleagues for any additional information on sampling for 1920 that might be relevant (particularly for the slightly larger counties).
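
As a rough back-of-the-envelope check, one can ask how likely it is that a county of a given size contributes no records at all. The sketch below assumes each person is drawn independently with probability 1%, which is a simplification: the actual 1920 sample design may differ (for example, by sampling households rather than persons), and the county sizes used are hypothetical. Under that assumption, the chance that a county of n people is absent is 0.99^n.

```python
def prob_absent(population, rate=0.01):
    """Probability that no one from a county is sampled, assuming
    independent person-level draws at the given sampling rate."""
    return (1.0 - rate) ** population


# Hypothetical county sizes, for illustration only.
for n in (50, 100, 500):
    print(n, prob_absent(n))
```

Under this simplification the probability falls off quickly as population grows, which is one reason the slightly larger omitted counties merit a closer look at the sample design.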