Hello,
I’ve been using the Longitudinal Tract Database (LTDB) to convert 1990 Census tracts to 2010 Census tracts for some time now. In the LTDB, I’ve been handling median values by weighting them by base values, multiplying by the interpolation weights, collapsing the data, and then unweighting.
Now, I’m working on converting 1990 Census blocks to 2010 Census tracts using NHGIS crosswalks, and I wanted to confirm best practices for handling median values at the block level.
Here’s the process I’ve been using for tracts:

Weighting the Medians by Base Variables: For each median variable, I apply a base value (like population). If the median value is missing, the base variable is set to missing as well to avoid incorrect calculations during the unweighting step. In my code, I multiply the median value by the corresponding base value to generate a new base-weighted variable.

Applying Interpolation Weights: Once the medians are adjusted by their base values, I apply the interpolation weights provided by the crosswalk. This ensures that the data is adjusted to reflect the proportion of each block or tract contributing to the target geography. For each variable (counts, medians, and base values), I multiply the values by the interpolation weight.

Collapsing into 2010 Census Tracts: After applying the interpolation weights, I collapse the data into 2010 census tracts by summing the weighted values. This step aggregates the data from the source zones (1990 tracts in my case) into the target zones (2010 census tracts). I also preserve missingness for any variables that were entirely missing before collapsing.

Unweighting the Median Variables: Finally, I “unweight” the median values by dividing them by the corresponding base values to obtain the final weighted medians. This step normalizes the medians based on the previously applied base values, ensuring that the aggregated values are properly scaled.
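
The four steps above can be sketched in plain Python as follows. This is only a minimal illustration, not my actual code: the zone IDs, the record layout, and the function name `interpolate_median` are all hypothetical, and the crosswalk is assumed to be a list of (source zone, target zone, interpolation weight) rows.

```python
from collections import defaultdict

# Hypothetical source-zone records: a median and its base value (e.g., population).
source = {
    "A": {"median": 30.0, "base": 100.0},
    "B": {"median": 40.0, "base": 300.0},
    "C": {"median": None, "base": 50.0},  # missing median
}

# Hypothetical crosswalk rows: (source zone, target zone, interpolation weight).
crosswalk = [
    ("A", "T1", 1.0),
    ("B", "T1", 0.5),
    ("B", "T2", 0.5),
    ("C", "T2", 0.4),
    ("C", "T3", 0.6),
]

def interpolate_median(source, crosswalk):
    """Base-weight, apply interpolation weights, collapse, then unweight."""
    weighted_median = defaultdict(float)  # sum of median * base * weight
    weighted_base = defaultdict(float)    # sum of base * weight
    targets = set()
    for src, tgt, weight in crosswalk:
        targets.add(tgt)
        rec = source[src]
        # Step 1: if the median is missing, treat the base as missing too,
        # so the record contributes to neither the numerator nor the denominator.
        if rec["median"] is None or rec["base"] is None:
            continue
        # Steps 1-3: base-weight the median, apply the interpolation weight,
        # and sum into the target zone.
        weighted_median[tgt] += rec["median"] * rec["base"] * weight
        weighted_base[tgt] += rec["base"] * weight
    # Step 4: unweight. Targets with no non-missing input stay missing (None).
    return {
        tgt: (weighted_median[tgt] / weighted_base[tgt]
              if weighted_base[tgt] > 0 else None)
        for tgt in sorted(targets)
    }

result = interpolate_median(source, crosswalk)
```

In this sketch, target T3 receives input only from a source zone with a missing median, so it stays missing rather than becoming zero.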
I’m wondering if I can follow a similar process with the NHGIS crosswalk for block-to-tract conversions. Specifically:
 Can I apply the interpolation weights provided by NHGIS in the same way?
 How should I approach collapsing and unweighting the block-level data?
 Are there any key differences or considerations when working with block-level data for medians?
Thank you in advance for your insights!
Best,
Aniket
 Can I apply the interpolation weights provided by NHGIS in the same way?
I see no conceptual issues with your proposal. It should work the same for block-level medians as it does for tract-level medians, and in fact, starting from block-level medians would produce more accurate results. As I understand, your method basically works on an assumption that all census responses within a source zone have a characteristic equal to the zone’s median (e.g., all people in a source tract have an age equal to the tract’s median age), and this assumption should be more accurate for block-level medians. Individual blocks have much smaller populations than tracts, so block-level medians should better correspond to individual-level characteristics than tract-level medians.
 How should I approach collapsing and unweighting the block-level data?
You could use the same workflow that you describe for tract-to-tract crosswalks and just replace the tract-to-tract crosswalk with an NHGIS block-to-tract crosswalk. The NHGIS website provides these general instructions for using the block crosswalks, and these instructions roughly mirror how you’ve described your process for tract-to-tract allocations. If that’s still not clear, please let me know.
 Are there any key differences or considerations when working with block-level data for medians?
The biggest issue is that, to my knowledge, the 1990 census summary files didn’t provide medians for blocks, and no source provides “long-form data” (covering income, education, marital status, nativity, etc.) for blocks.
For long-form data, you’d need to start from block groups, block group parts, or census tracts, as explained in this section of the Crosswalks page.
For medians of 1990 short-form characteristics (like median age), you could still use block-level data, but you wouldn’t be able to start directly with a median statistic. Instead, you could use one of the two strategies I suggested in this earlier forum post, which I’ve copied here:
 Use means instead of medians, and apply crosswalks separately for each mean’s numerator and denominator. E.g., to compute per capita income, you could estimate “aggregate income” and “total population” separately using the crosswalk weights, and then divide one by the other.
 Start with a table of counts broken down by the value of interest, e.g., housing units by home value, and use the crosswalk to estimate each count in the table. Then estimate the median from the frequency distribution, for which there are various online guides.
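
Here is a rough Python sketch of both strategies, in case it helps. Everything in it is an assumption for illustration: the block IDs, the column names (`agg_value`, `units`, `v_0_50`, etc.), the bin edges, and the helper names `allocate` and `grouped_median` are hypothetical, and the median step uses simple linear interpolation within the median bin, which is only one of several ways to estimate a median from a frequency distribution.

```python
from collections import defaultdict

# Hypothetical block-level data: an aggregate, its universe count, and
# counts of housing units by home value (bins in thousands of dollars).
blocks = {
    "b1": {"agg_value": 4000.0, "units": 50.0,
           "v_0_50": 10.0, "v_50_100": 30.0, "v_100_200": 10.0},
    "b2": {"agg_value": 6000.0, "units": 50.0,
           "v_0_50": 0.0, "v_50_100": 20.0, "v_100_200": 30.0},
}

# Hypothetical crosswalk rows: (source block, target tract, interpolation weight).
crosswalk = [("b1", "T1", 1.0), ("b2", "T1", 0.5), ("b2", "T2", 0.5)]

# Assumed bin edges. A real open-ended top bin would need an assumed upper bound.
BINS = [("v_0_50", 0.0, 50.0), ("v_50_100", 50.0, 100.0), ("v_100_200", 100.0, 200.0)]

def allocate(counts_by_source, crosswalk):
    """Sum weight * value into each target zone, column by column."""
    out = defaultdict(lambda: defaultdict(float))
    for src, tgt, weight in crosswalk:
        for col, val in counts_by_source[src].items():
            out[tgt][col] += weight * val
    return {tgt: dict(cols) for tgt, cols in out.items()}

def grouped_median(bins):
    """Estimate a median from (lower, upper, count) bins sorted by lower edge."""
    total = sum(count for _, _, count in bins)
    if total <= 0:
        return None
    half = total / 2.0
    cumulative = 0.0
    for lower, upper, count in bins:
        if count > 0 and cumulative + count >= half:
            # Interpolate linearly within the bin that contains the midpoint.
            return lower + (half - cumulative) / count * (upper - lower)
        cumulative += count
    return None

tracts = allocate(blocks, crosswalk)

# Strategy 1: a mean from a separately allocated numerator and denominator.
means = {t: v["agg_value"] / v["units"] for t, v in tracts.items() if v["units"] > 0}

# Strategy 2: a median from the allocated frequency distribution.
medians = {t: grouped_median([(lo, hi, v[col]) for col, lo, hi in BINS])
           for t, v in tracts.items()}
```

Note that the two strategies generally give different answers for the same tract, since one estimates a mean and the other a median.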