Differential Privacy - Demo Data

Jeffrey_Bannon · November 6, 2019, 4:03pm

Since the Census differential privacy plan involves adding noise, shouldn’t the demo data include margins of error or standard errors for the estimates?

JonathanSchroeder · November 6, 2019, 4:41pm

First, to confirm, IPUMS has not chosen to omit margins of error or standard errors from our version of the demo data. That omission is original to the Census Bureau’s demo data.

I agree that if the Census Bureau uses differential privacy (DP) for the final 2020 Census data, it would be valuable for the Bureau to include margins of error or standard errors. I recommend sharing that concern directly with the Bureau through their demonstration data feedback email address: dcmd.2010.demonstration.data.products@census.gov.

With the demo data, you can compute approximate margins of error based on the distribution of differences between the demo data and the original data. This is only an approximation because both the original data and the demo data include some deliberate errors for disclosure avoidance, so differences between the two sources do not directly equate to errors in the demo data.

There are a few exceptions, though. The original 2010 data held several statistics “invariant” that the demo data does not. Specifically, total population, voting-age population, number of housing units, number of occupied housing units, and number and type of group quarters are all invariant for all summary levels in the original 2010 summary files. As I understand it, the demo data holds only total housing units and number and type of group quarters invariant for all summary levels. (Total population is invariant only for states, regions, divisions, and the nation.) This means that any differences in total population (below the state level), voting-age population, and number of occupied housing units represent errors in the demo data due to injected noise, because the original values for those statistics were not altered.
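As a rough illustration of the approach described in this reply, here is a minimal Python sketch that estimates an approximate 90% margin of error from the differences between original and demo counts for a set of geographic units. Everything in it is an assumption made for illustration: the function name `empirical_moe`, the nearest-rank quantile rule, and the counts themselves, which are made up rather than taken from either data product. It is not a Census Bureau or IPUMS method.

```python
import math

def empirical_moe(original, demo, level=0.90):
    """Approximate margin of error: the nearest-rank `level` quantile
    of the absolute differences |demo - original| across units."""
    if len(original) != len(demo) or not original:
        raise ValueError("inputs must be non-empty and the same length")
    abs_diffs = sorted(abs(d - o) for o, d in zip(original, demo))
    # Smallest absolute difference that covers at least `level` of the units.
    rank = math.ceil(level * len(abs_diffs))
    return abs_diffs[rank - 1]

# Made-up total population counts for ten hypothetical units
# (e.g., tracts), paired by position: original 2010 vs. demo data.
original_pop = [1200, 850, 430, 2210, 975, 60, 3100, 1500, 720, 1890]
demo_pop = [1195, 858, 421, 2213, 980, 66, 3089, 1502, 717, 1902]

moe_90 = empirical_moe(original_pop, demo_pop, level=0.90)
print(f"Approximate 90% margin of error: +/- {moe_90}")
```

In practice you would likely compute this separately for groups of similar units (for example, by summary level or population size class), since the relative impact of the noise probably varies with the size of the count. For statistics that were invariant in the original 2010 data, the result reflects error in the demo data alone; for other statistics it mixes in the original data's own disclosure-avoidance errors.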
