Can I use NHGIS Crosswalk for Block Group Level Data?


I have been able to use the NHGIS crosswalk to interpolate Census 2000 block level data to Census 2010 block boundaries. How can I do the same with Census 2000 block group level data? I would like to use source variables that are only available at the block group level.




The NHGIS block-to-block crosswalks can help with this application, but they are not sufficient by themselves. Some additional modeling is required.

For example, say you want to interpolate 2000 foreign-born population (FBP) from 2000 block groups to 2010 block groups. A general workflow is:

  1. Disaggregate FBP from 2000 block groups to 2000 blocks

  2. Use the NHGIS block-to-block crosswalk to interpolate FBP estimates from 2000 blocks to 2010 blocks

  3. Sum the 2010-block-level estimates of 2000 FBP within each 2010 block group

Steps 2 and 3 are relatively straightforward. For Step 1, you’d have to estimate how 2000 block-group populations are distributed among 2000 blocks.
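
As a rough sketch of Steps 2 and 3 in plain Python: this assumes you already have block-level FBP estimates from Step 1, and that you have read the crosswalk into (2000 block, 2010 block, weight) records, where the weight is the share of the 2000 block’s count to allocate to the 2010 block. The function names, the toy IDs, and the block-to-block-group lookup are all hypothetical, so adapt them to the actual columns in your files.

```python
from collections import defaultdict

def interpolate_blocks(src_values, crosswalk):
    """Step 2: allocate 2000-block values to 2010 blocks using crosswalk weights."""
    tgt_values = defaultdict(float)
    for src_block, tgt_block, weight in crosswalk:
        # Source blocks missing from src_values contribute nothing.
        tgt_values[tgt_block] += src_values.get(src_block, 0.0) * weight
    return dict(tgt_values)

def sum_to_block_groups(block_values, block_to_bg):
    """Step 3: sum block-level estimates within each block group."""
    bg_values = defaultdict(float)
    for block, value in block_values.items():
        bg_values[block_to_bg[block]] += value
    return dict(bg_values)

# Toy data (hypothetical IDs): block-level 2000 FBP estimates from Step 1.
fbp_2000_blocks = {"a1": 6.0, "a2": 14.0}
# Block a2 is split between two 2010 blocks; its weights sum to 1.
crosswalk = [("a1", "x1", 1.0), ("a2", "x1", 0.25), ("a2", "y1", 0.75)]
# Lookup from each 2010 block to its 2010 block group.
blocks_2010_to_bg = {"x1": "X", "y1": "Y"}

fbp_2010_blocks = interpolate_blocks(fbp_2000_blocks, crosswalk)
fbp_2010_bgs = sum_to_block_groups(fbp_2010_blocks, blocks_2010_to_bg)
print(fbp_2010_bgs)
```

With real data, the lookup from blocks to block groups can usually be derived from the block identifiers themselves rather than stored as a separate table.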

In this example, it would make sense to allocate FBP among blocks in proportion to block population. That is, you’d assume that each block’s share of the block group’s FBP equals its share of the block group’s total population. So if one block contains 3% of the block group’s population, you’d assign that block 3% of the block group’s FBP.
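
A minimal sketch of this proportional allocation in plain Python, using the 3% example: the function name, the toy IDs, and the data layout are hypothetical. It also makes one assumption the approach itself does not dictate, namely that a block group with zero total population gets no allocation at all, so you may want to handle that case differently.

```python
from collections import defaultdict

def disaggregate(bg_values, block_guide, block_to_bg):
    """Allocate each block group's value to its blocks in proportion to a guide count."""
    # Total of the guide variable (here, population) within each block group.
    bg_guide_total = defaultdict(float)
    for block, guide in block_guide.items():
        bg_guide_total[block_to_bg[block]] += guide

    estimates = {}
    for block, guide in block_guide.items():
        bg = block_to_bg[block]
        total = bg_guide_total[bg]
        # Assumption: if the guide total is zero, allocate nothing to any block.
        share = guide / total if total > 0 else 0.0
        estimates[block] = bg_values.get(bg, 0.0) * share
    return estimates

# Toy data (hypothetical IDs): one block group "A" with three blocks.
block_to_bg = {"A1": "A", "A2": "A", "A3": "A"}
block_pop = {"A1": 30, "A2": 970, "A3": 0}   # A1 holds 3% of the population
bg_fbp = {"A": 200}                          # block-group-level FBP

block_fbp = disaggregate(bg_fbp, block_pop, block_to_bg)
print(block_fbp)  # A1 receives 3% of the block group's FBP
```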

For other statistics, it would be better to use other block characteristics to guide the disaggregation. For example, to disaggregate aggregate household income, it would make sense to use block-level household counts, and to disaggregate the population 25 years and over with a bachelor’s degree, it would make sense to use block-level counts of population 25 years and over.

If you need to interpolate many block-group statistics, you may also want to start by generating your own block-group-to-block-group crosswalk. For each intersection between a 2000 block group and a 2010 block group, use 2000 block data and the NHGIS block-to-block crosswalk to estimate the proportion of the 2000 block group’s population that lies in the 2010 block group. You could then re-use this smaller BG-to-BG crosswalk for each statistic you need to interpolate. If you want to use different guide variables for different statistics (e.g., households instead of population), you’d have to add a separate set of proportions to your crosswalk for each guide variable.
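
Here is one way such a BG-to-BG crosswalk could be built and applied, sketched in plain Python. The function names, toy IDs, and lookups are hypothetical, and the crosswalk is again assumed to be a list of (2000 block, 2010 block, weight) records. One design choice to note: the denominator is the population actually allocated through the crosswalk, so the proportions for each 2000 block group sum to 1 even if some block weights do not.

```python
from collections import defaultdict

def build_bg_crosswalk(block_pop_2000, block_crosswalk, bg2000_of, bg2010_of):
    """Estimate, for each (2000 BG, 2010 BG) pair, the share of the 2000 BG's
    population that lies in the 2010 BG."""
    pair_pop = defaultdict(float)   # population allocated to each BG pair
    src_pop = defaultdict(float)    # population allocated from each 2000 BG
    for src_block, tgt_block, weight in block_crosswalk:
        moved = block_pop_2000.get(src_block, 0.0) * weight
        src_bg = bg2000_of[src_block]
        pair_pop[(src_bg, bg2010_of[tgt_block])] += moved
        src_pop[src_bg] += moved
    # Assumption: 2000 BGs with zero population are left out of the crosswalk.
    return {pair: pop / src_pop[pair[0]]
            for pair, pop in pair_pop.items() if src_pop[pair[0]] > 0}

def apply_bg_crosswalk(bg_values_2000, bg_crosswalk):
    """Interpolate any 2000 block-group statistic to 2010 block groups."""
    out = defaultdict(float)
    for (src_bg, tgt_bg), proportion in bg_crosswalk.items():
        out[tgt_bg] += bg_values_2000.get(src_bg, 0.0) * proportion
    return dict(out)

# Toy data (hypothetical IDs).
block_pop_2000 = {"a1": 30, "a2": 70}
block_crosswalk = [("a1", "x1", 1.0), ("a2", "x1", 0.25), ("a2", "y1", 0.75)]
bg2000_of = {"a1": "A", "a2": "A"}
bg2010_of = {"x1": "X", "y1": "Y"}

bg_crosswalk = build_bg_crosswalk(block_pop_2000, block_crosswalk, bg2000_of, bg2010_of)
fbp_2010 = apply_bg_crosswalk({"A": 200}, bg_crosswalk)
print(bg_crosswalk)
print(fbp_2010)
```

To support a second guide variable such as households, you could call `build_bg_crosswalk` again with block-level household counts and store the result as another set of proportions.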

NHGIS has plans to construct crosswalks like these, and to generate standardized time series of block-group source data, using the same techniques I described above. However, it may be a year or more before these products are available.


That makes sense!