I am working on crosswalking several long-form Census datasets between 1990, 2000, and 2010. To make sure I am doing this properly, I thought I would test my approach by applying the exact same methodology to a short-form dataset for which a time series is already available. Please check my logic here, but if I am doing the crosswalk correctly, then my crosswalked values for 1990 and 2000 should match those in the time series, right? While the 1990 data match up perfectly, the 2000 data do not (see below). I’m wondering if anyone has some insight into why this discrepancy is emerging.
Here’s my general approach:
1. Download time series data for Total Population, Standardized to 2010 by Block Group (“nhgisXXXX_ts_geog2010_blck_grp.csv”).
2. Download 1990 population data for block group parts (“nhgisXXXX_ds120_1990_blck_grp_598.csv”) and 2000 population data for BGPs (“nhgisXXXX_ds152_2000_blck_grp_090.csv”).
3. Perform crosswalk analysis using the appropriate crosswalk lookup tables (“nhgis_bgp1990_bg2010_49.csv” and “nhgis_bgp2000_bg2010_49.csv”). You’ll notice from the “49” that I’m just working on Utah data. This is done in R (happy to share code if it will help…). Basically, I just (1) perform a join between lookup tables and block group part data, (2) multiply population totals by population weights, and (3) sum up the resulting weighted population values per block group.
4. Compare crosswalked values to time series values.
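In case it helps to see the join, weight, and sum logic and the final comparison spelled out, here is a minimal sketch. It is written in Python with toy data rather than my actual R code, and the field names (src, tgt, wt, pop) and the tolerance are placeholders I made up for illustration, not the real NHGIS column names.

```python
from collections import defaultdict


def crosswalk(source_rows, lookup_rows):
    """Allocate source-zone counts to target zones using interpolation weights.

    Field names "src", "tgt", "wt", and "pop" are hypothetical placeholders.
    """
    # Index the source counts by source-zone ID so the lookup can join to them.
    pop_by_src = {row["src"]: float(row["pop"]) for row in source_rows}

    totals = defaultdict(float)
    for row in lookup_rows:
        # (1) join: find the source count for this lookup record
        pop = pop_by_src.get(row["src"])
        if pop is None:
            continue  # lookup record with no matching source zone
        # (2) multiply by the weight, (3) sum per target zone
        totals[row["tgt"]] += pop * float(row["wt"])
    return dict(totals)


def compare(crosswalked, reference, tol=0.5):
    """Return target zones whose crosswalked value is more than tol from the reference."""
    diffs = {}
    for zone, ref in reference.items():
        est = crosswalked.get(zone, 0.0)
        if abs(est - ref) > tol:
            diffs[zone] = (est, ref)
    return diffs


# Toy data: source zone A is split between targets X and Y; B falls wholly in Y.
source = [{"src": "A", "pop": "100"}, {"src": "B", "pop": "50"}]
lookup = [
    {"src": "A", "tgt": "X", "wt": "0.25"},
    {"src": "A", "tgt": "Y", "wt": "0.75"},
    {"src": "B", "tgt": "Y", "wt": "1.0"},
]
result = crosswalk(source, lookup)

# Made-up "time series" values to compare against.
reference = {"X": 25.0, "Y": 120.0}
diffs = compare(result, reference)
print(result)
print(diffs)
```

The comparison only flags block groups that differ by more than the tolerance, so small rounding differences would not show up as mismatches.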
Any help or insight would be great! Thanks in advance.