Differential Privacy - Demo Data

Since the Census differential privacy plan involves adding noise, shouldn’t the demo data include margins of error or standard errors for the estimates?

First, to confirm, IPUMS has not chosen to omit margins of error or standard errors from our version of the demo data. That omission is original to the Census Bureau’s demo data.

I agree that if the Census Bureau uses differential privacy (DP) for the final 2020 Census data, it would be valuable for the Bureau to include margins of error or standard errors. I recommend sharing that concern directly with the Bureau through their demonstration data feedback email address: dcmd.2010.demonstration.data.products@census.gov.

In the meantime, you can compute approximate margins of error from the distribution of differences between the demo data and the original data. This is only an approximation because both the original data and the demo data include some deliberate errors for disclosure avoidance, so differences between the two sources do not directly equate to errors in the demo data.
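
As a rough sketch of that approach, the Python snippet below takes paired counts for the same geographic units and returns the absolute difference that about 90% of the units fall within. The function name, the nearest-rank percentile method, and the sample counts are all assumptions made for illustration. None of them come from the Census Bureau or IPUMS, and the numbers are made up.

```python
import math

def approx_margin_of_error(demo, original, level=0.90):
    """Nearest-rank percentile of |demo - original| across geographic units.

    Roughly `level` of the units differ by no more than the returned value.
    """
    if len(demo) != len(original) or not demo:
        raise ValueError("need two non-empty sequences of equal length")
    if not 0 < level <= 1:
        raise ValueError("level must be in (0, 1]")
    abs_diffs = sorted(abs(d - o) for d, o in zip(demo, original))
    rank = math.ceil(level * len(abs_diffs))  # 1-based nearest rank
    return abs_diffs[rank - 1]

# Made-up counts for ten hypothetical tracts (not real Census values).
original_counts = [100, 100, 205, 50, 310, 150, 80, 60, 120, 228]
demo_counts = [105, 98, 210, 47, 300, 152, 81, 64, 119, 230]
moe_90 = approx_margin_of_error(demo_counts, original_counts)
```

In practice it may be worth computing this separately for each statistic and summary level, since a single figure pooled across very different kinds of units can be hard to interpret.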

There are a few exceptions, though, where the differences do measure error directly. The original 2010 data held several statistics “invariant” (published exactly as enumerated) that the demo data does not. Specifically, total population, voting-age population, number of housing units, number of occupied housing units, and number and type of group quarters are all invariant for all summary levels in the original 2010 summary files. As I understand it, the demo data holds only total housing units and number and type of group quarters invariant for all summary levels. (Total population is invariant only for states, regions, divisions, and the nation.) This means any differences in total population, voting-age population, and number of occupied housing units represent errors in the demo data due to injected noise.
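
For those statistics, the demo-minus-original difference can be treated as the error itself and summarized directly. The sketch below is one hedged way to do that in Python. The function name, the dictionary layout keyed by a geography ID, and the sample values are hypothetical and are not taken from the actual data files.

```python
import math

def noise_error_summary(original, demo):
    """Summarize demo-minus-original errors for a statistic that was
    invariant in the original data.

    Both arguments map a geography ID to a count. Only IDs present in
    both sources are compared.
    """
    shared = sorted(original.keys() & demo.keys())
    if not shared:
        raise ValueError("no geography IDs in common")
    errors = {geo: demo[geo] - original[geo] for geo in shared}
    n = len(errors)
    mean_abs_error = sum(abs(e) for e in errors.values()) / n
    rmse = math.sqrt(sum(e * e for e in errors.values()) / n)
    return errors, mean_abs_error, rmse

# Made-up total population counts for three hypothetical counties.
original_pop = {"county_a": 1000, "county_b": 250, "county_c": 40}
demo_pop = {"county_a": 1003, "county_b": 246, "county_c": 40}
errors, mean_abs_error, rmse = noise_error_summary(original_pop, demo_pop)
```

Here `errors` holds the signed error for each unit, while the mean absolute error and root mean squared error give a single summary of how much noise was injected for that statistic.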