Regressions with "Perturbed" Variables

Dear All,

I would like to learn about what predicts the duration of unemployment (time it takes people who got laid off to find a new job), using monthly CPS data.

To that end, in a nutshell, I am planning to regress the duration of unemployment (DURUNEMP) on variables like AGE, SEX, OCC, IND, and EARNWEEK (plus some other variables from the Earner Study/Outgoing Rotation Groups).

In the Census CPS documentation, it says that some of my independent variables, e.g., AGE, “were altered, or ‘perturbed’, in the public use microdata files to further protect the confidentiality of survey respondents.” Should I be worried about how this affects my estimates (ignoring for a second that they are not identified in the first place)?

Many thanks for your guidance!

Hi Chantal,

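Perturbing an independent variable such as AGE introduces measurement error into that variable. If the perturbation behaves like classical measurement error (noise unrelated to the true value and to the regression error), the coefficient on the perturbed variable is attenuated, i.e., biased towards zero, and the coefficients on covariates correlated with it can be biased as well. An example of how purposefully introduced measurement error in Census data affected outcomes is provided in Cleveland et al. (2012). For privacy reasons, this is simply a current reality when working with public use microdata. I hope that information helps!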
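
To get a feel for the size of the attenuation, here is a minimal simulation sketch in Python. It is not based on the actual CPS perturbation procedure, which I am not assuming anything about; it simply adds independent normal noise to a simulated regressor (the classical measurement error case). The function name, the sample size, and all parameter values are made up for illustration. Under these assumptions, the slope estimated from the noisy regressor should be close to the true slope times var(x) / (var(x) + var(noise)).

```python
import numpy as np


def simulate_attenuation(n=200_000, beta=2.0, sigma_x=1.0, sigma_u=0.5, seed=0):
    """Compare OLS slopes using a true regressor and a noisy copy of it.

    Hypothetical setup: the noise is classical measurement error,
    i.e. independent of the true regressor and of the outcome error.
    """
    rng = np.random.default_rng(seed)

    x_true = rng.normal(0.0, sigma_x, n)          # unobserved true regressor
    x_obs = x_true + rng.normal(0.0, sigma_u, n)  # perturbed (observed) regressor
    y = 1.0 + beta * x_true + rng.normal(0.0, 1.0, n)

    # np.polyfit with degree 1 returns [slope, intercept]
    slope_true = np.polyfit(x_true, y, 1)[0]
    slope_obs = np.polyfit(x_obs, y, 1)[0]

    # Slope predicted by the classical attenuation formula
    reliability = sigma_x**2 / (sigma_x**2 + sigma_u**2)
    slope_predicted = beta * reliability

    return slope_true, slope_obs, slope_predicted


if __name__ == "__main__":
    slope_true, slope_obs, slope_predicted = simulate_attenuation()
    print(f"slope using true regressor:      {slope_true:.3f}")
    print(f"slope using perturbed regressor: {slope_obs:.3f}")
    print(f"slope predicted by formula:      {slope_predicted:.3f}")
```

If the perturbation applied to the public use files is small relative to the natural variation in the variable, the ratio above is close to one and the attenuation is correspondingly small. If the perturbation is not classical (for example, if it depends on the true value), this simple formula does not apply.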
