Variables in NHGIS files

Hello NHGIS folks,

This is Dafeng, who previously worked at MPC! I have a question about NHGIS variables:

When I try to get farm population and urban population variables for 1870 to 1950, I cannot find them after I “apply filters”. Here is what I do: I choose “state and county” as the geographic levels, and then choose “1870 to 1950” as the years. But nothing about farm status or urban status shows up in “Select Data”. What is missing here? I am sure that these variables were collected in these census years.

A related question: there is indeed a county-level variable called “average value of farmland” from the agriculture census data. But when I try to generate the table, NHGIS keeps warning me that “One or more tables lack a geographic level selection (see below)”, even though I have chosen state and county. Did I do something wrong here?



Hi Dafeng,

When I follow the steps you describe, I don’t get the same results. I’ve attached a couple of screenshots to show you what I see…

First case: [screenshot]

Second case: [screenshot]

In the first case, make sure you’ve selected the “SOURCE TABLES” tab in the SELECT DATA area. There are no time series tables for these filter selections, so if the time series tables tab is selected instead, you will see no tables available.

In the second case, make sure that the SELECTIONS area shows “2 of 2” Geographic Levels selected. If your screen says “0 of 2”, then you still need to select the state and county levels for your request.

(If you selected the State and County filters on the first page, those selections apply only to datasets you add to your data cart after selecting the filters. Any datasets you added to your cart before selecting Geographic Levels filters will have no geographic level selections.)

If you’re still unable to get the results you want, please share screenshots of the problem scenarios you describe. That would help me diagnose what’s going on.


Hi Jonathan,


I think what I did was choose “ALL” rather than “OR” for the years. When I choose “OR” rather than “AND”, variables with the same code (e.g., “NT2”) are generated separately for each year, and I need to append the tables together. Does that mean the variable is inconsistent over time, even though it has a consistent code?



First, some notes on terminology… Your example of “NT2” identifies a table code, not a variable code. Each table contains one or more variables. Each variable corresponds to a single summary statistic (e.g., number of farm families that own their farm), which corresponds to a single column in an output data file.

NHGIS supplies two types of tables: “source tables” and “time series tables.” There are separate tabs for these two types below “SELECT DATA” in the interface. The source tables are not harmonized across datasets. (This is similar to the unharmonized “source variables” on IPUMS microdata sites.) Codes for source tables are unique to each dataset and don’t indicate any meaningful correspondence with tables in other datasets.

So to answer your question: a source table code of “NT2” does not indicate any consistency with NT2 tables in other datasets. It only indicates that it’s the second table in a particular dataset.

NHGIS time series tables are more like “harmonized variables” on microdata sites. We design the data in time series tables to have consistent meanings across time. But the coverage of time series tables is limited, largely because it’s quite difficult and time-consuming to harmonize summary data, so many subjects that are covered in NHGIS source tables are not covered in the time series tables.
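If you do end up appending source tables from several years yourself, here is a rough sketch in Python of one way to go about it. It is only an illustration: the column codes (AAA001, BBB001), the values, the identifier column names, and the name farm_value are all made up, and the real codes have to come from the codebook of each extract. The point is that the year-to-column mapping is something you build by hand after checking the definitions, because a shared table code does not tell you the columns mean the same thing.

```python
import csv
import io

# Hypothetical per-year extracts. Column codes and values are invented for
# illustration; real codes come from the codebook delivered with each extract.
EXTRACT_1900 = """GISJOIN,YEAR,STATE,COUNTY,AAA001
G0100010,1900,Alabama,Autauga,100
G0100030,1900,Alabama,Baldwin,200
"""
EXTRACT_1910 = """GISJOIN,YEAR,STATE,COUNTY,BBB001
G0100010,1910,Alabama,Autauga,110
G0100030,1910,Alabama,Baldwin,220
"""

# Manual crosswalk (an assumption of this sketch): for each year, which source
# column holds the statistic you have decided is comparable across years.
CROSSWALK = {
    "1900": {"AAA001": "farm_value"},
    "1910": {"BBB001": "farm_value"},
}

# Identifier columns assumed to be present in every extract.
ID_COLUMNS = ["GISJOIN", "YEAR", "STATE", "COUNTY"]


def stack_extracts(extracts, crosswalk):
    """Append per-year tables, renaming year-specific columns to common names."""
    stacked = []
    for text in extracts:
        for row in csv.DictReader(io.StringIO(text)):
            mapping = crosswalk[row["YEAR"]]
            out = {col: row[col] for col in ID_COLUMNS}
            for source_col, common_name in mapping.items():
                out[common_name] = float(row[source_col])
            stacked.append(out)
    return stacked


rows = stack_extracts([EXTRACT_1900, EXTRACT_1910], CROSSWALK)
for r in rows:
    print(r["YEAR"], r["COUNTY"], r["farm_value"])
```

Keeping the crosswalk as an explicit dictionary makes each harmonization decision visible, so a year whose definition turns out not to match can simply be left out.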

Thanks!!! This is very clear. Thanks for your reply, Jonathan.