I found a substantial difference between the number of unique blocks in the block-level crosswalk file (e.g., Blocks → Blocks: GISJOIN Identifiers for 1990 to 2010) and the number in the decennial census data.

To be concrete, let’s take California. The crosswalk file has 399,721 unique blocks for 1990, while the 1990_STF1 file for the population variable has 287,051 unique blocks. What could explain the difference? More importantly, if a block is in the crosswalk data but not in the 1990_STF1 data, is it fair to assume that the block had zero population in 1990? I wasn’t entirely sure whether the zero-population assumption is reasonable, given that there are quite a few blocks with zero population in the 1990_STF1 file as well…

Any advice would be highly appreciated. Thank you!

The 1990 STF1 file in NHGIS has no records for census blocks with zero population and zero housing units. Those records do show up in the crosswalk file, though, so that’s why you’re observing so many more 1990 census blocks in the crosswalk vs. the STF1 file.

The 1990 STF1 block records with zero people are census blocks that contain at least one housing unit. If you’re only looking at a population table, you’ll see zero counts for those blocks. But if you downloaded a housing unit table, you’d see non-zero counts for those blocks.
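
In practice, that means a 1990 block that appears in the crosswalk but has no STF1 record can be treated as having zero people and zero housing units. Here is a minimal Python sketch of that fill-in step. The block IDs, the `pop` and `housing_units` keys, and the function name are made up for illustration and are not actual NHGIS column names, and the sketch assumes the only reason a block is missing from STF1 is the zero/zero rule described above.

```python
def fill_missing_blocks(crosswalk_blocks, stf1_counts):
    """Return counts for every 1990 block listed in the crosswalk.

    Blocks with no STF1 record get zero population and zero housing
    units (assumption: they are absent only because of the zero/zero rule).
    """
    filled = {}
    for block_id in crosswalk_blocks:
        if block_id in stf1_counts:
            # Copy the existing record so the input is not modified later.
            filled[block_id] = dict(stf1_counts[block_id])
        else:
            filled[block_id] = {"pop": 0, "housing_units": 0}
    return filled


# Hypothetical toy data, not real GISJOIN identifiers.
crosswalk_blocks = ["BLOCK_A", "BLOCK_B", "BLOCK_C"]
stf1_counts = {
    "BLOCK_A": {"pop": 120, "housing_units": 45},
    "BLOCK_B": {"pop": 0, "housing_units": 3},  # zero people, but has housing
}

filled = fill_missing_blocks(crosswalk_blocks, stf1_counts)
missing = [b for b in crosswalk_blocks if b not in stf1_counts]
```

In this toy example, `BLOCK_B` stays in the data with zero population because it has housing units, while `BLOCK_C` is the kind of block that appears only in the crosswalk and gets filled with zeros.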

