Applying nhgiscode from time series file to ACS 2005-2009 census tract observations

Hi all, I know that there are some “idiosyncrasies” that prevent the ACS 2005-2009 tables from being incorporated into time series format along with the other five-year estimates. I tried to get around this by downloading the relevant variables for the 2005-2009 file and appending them to the time series file with the other five-year estimates. This works for many cases; however, I need to link the appended file to another data set I created containing five-year averages of the home value variable, which also does not seem to be available in time series format.

It looks like the best way to make this link is to use the nhgiscode variable. This, however, isn’t included in the ACS 2005-2009 file (it is included in the housing value file but under a different name). I tried to solve this by sorting my appended data set by state FIPS code, county FIPS code, and census tract number and then “copying over” the nhgiscode from later years to the 2005-2009 observation when I could confirm that all of the geocode components listed above matched. This still leaves a large number of observations from the 2005-2009 vintage without an nhgiscode.

I’m wondering if anyone has advice on filling in the remaining blanks.

Thank you,

Kelsey

I’m not totally sure I understand the issue you’re encountering, but I’ll provide some relevant info here that could help explain it. If this doesn’t clarify the problem for you, then feel free to reply again. In that case, it would help if you could provide a specific example of a problem case where you weren’t able to join a 2005-2009 ACS tract record to the time series table.

Importantly, the NHGIS time series tables that contain ACS data are (currently) only “nominally integrated”:

Nominally integrated tables link geographic units across time according to their names and codes, disregarding any changes in unit boundaries. The identified geographic units match those from each census source, so the spatial definitions and total number of units may vary from one time to another (e.g., a city may annex land, a tract may be split in two, a new county may be created from parts of others, etc.). The tables include data for a particular geographic unit only at times when the unit’s name or code was in use, resulting in truncated time series for some areas.

In the case of census tracts, there are many changes in tract definitions with every census. In a nominally integrated time series table, there will be some records for tracts that existed in 2000 but not in 2010, and vice versa.

The tracts in the 2005-2009 ACS mostly correspond to 2000 census tracts, so the tract data you get from the 2005-2009 ACS will mostly match up only with the records for 2000 census tracts in the time series tables. There may be many records in the time series table for tracts that existed only in earlier or later years that won’t match with the 2005-2009 ACS tract data.

The NHGISCODE identifiers in time series tables correspond to the “GISJOIN” fields that appear in NHGIS “source table” data files from each specific year. Every GISJOIN ID in a 2005-2009 ACS tract data file should have a match in the NHGISCODE field in the time series table, but there will still be many records in the time series table with no match in the 2005-2009 ACS tract data file.
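As a rough illustration of that join, here is a minimal pandas sketch. It assumes the 2005-2009 ACS extract carries a `GISJOIN` column and the time series table carries `NHGISCODE`, as described above. The data frame names, the value columns, and the toy rows are all made up for the example; with real extracts you would load the CSV files instead.

```python
import pandas as pd

# Toy stand-ins for the two extracts (made-up rows and hypothetical value columns).
# With real NHGIS files these would come from pd.read_csv(...).
acs_0509 = pd.DataFrame({
    "GISJOIN": ["G0100010020100", "G0100010020200", "G0100010099900"],
    "median_value_0509": [120000, 95000, 143000],
})
time_series = pd.DataFrame({
    "NHGISCODE": ["G0100010020100", "G0100010020200", "G0100010020300"],
    "median_value_1014": [125000, 99000, 110000],
})

# Outer join on the two identifier columns; indicator=True adds a "_merge"
# column recording whether each row came from one table or both.
merged = acs_0509.merge(
    time_series,
    left_on="GISJOIN",
    right_on="NHGISCODE",
    how="outer",
    indicator=True,
)

# Tally matched and unmatched records, then list the ACS tracts with no match.
counts = merged["_merge"].value_counts()
acs_only = merged.loc[merged["_merge"] == "left_only", "GISJOIN"].tolist()

print(counts)
print(acs_only)
```

Rows flagged `right_only` are time series tracts with no 2005-2009 counterpart, which is expected in a nominally integrated table. Rows flagged `left_only` are the ones worth inspecting by hand.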

Note: One of the significant “idiosyncrasies” with 2005-2009 ACS tract data is that there are 19 counties where the ACS tracts don’t match with 2000 census tracts and instead represent preliminary versions of 2010 census tracts. We provide more info about this issue on our GIS files page. I would guess you could still use the NHGISCODE to match most of the tracts in these problem counties, but there may be a few tract codes that existed only in the 2005-2009 ACS data and therefore don’t appear at all in our nominally integrated tables.


Ope! Forgot to respond. This helped significantly. Thanks, Jonathan.
