Census samples: sampling design and data quality


Are the 5% Census microdata samples drawn entirely at random? Or does the selection condition on data quality, for example by upweighting observations with unaltered responses? I ask because I would like to understand whether summary statistics of the data quality flag variables in the Census sample microdata are informative about the overall imputation rate of the same variables in the corresponding aggregate Census Summary File datasets.




The sampling procedures used to produce the public use samples have varied over time. Detailed descriptions of these sampling procedures can be found on the IPUMS-USA website. Each sampling procedure includes some element of randomness, with additional efforts made to ensure representativeness. However, I do not believe data quality or completeness has ever been used as a condition for selection.
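
If that is right, the weighted share of flagged records in a sample should be a reasonable estimate of the imputation rate for that variable in the population the sample represents, subject to sampling error. Here is a minimal sketch of that calculation in Python. The column names `QAGE` (a quality flag) and `PERWT` (a person weight) are used for illustration only, the rule that any nonzero flag code means an altered response is an assumption to check against the codes for the flag you are using, and the records are toy values, not real Census data.

```python
def weighted_imputation_rate(records, flag_key="QAGE", weight_key="PERWT"):
    """Return the weighted share of records whose quality flag is nonzero."""
    total_weight = 0.0
    imputed_weight = 0.0
    for rec in records:
        weight = float(rec[weight_key])
        total_weight += weight
        # Assumption: code 0 means the response was not altered;
        # any other code is treated as imputed or edited.
        if int(rec[flag_key]) != 0:
            imputed_weight += weight
    if total_weight == 0:
        raise ValueError("sample has no weight")
    return imputed_weight / total_weight


# Toy records for illustration only (not real data).
sample = [
    {"QAGE": 0, "PERWT": 20},
    {"QAGE": 4, "PERWT": 20},
    {"QAGE": 0, "PERWT": 40},
    {"QAGE": 0, "PERWT": 20},
]

rate = weighted_imputation_rate(sample)
print(f"Weighted imputation rate: {rate:.1%}")
```

Using the weights matters because records in a sample do not necessarily represent the same number of people; an unweighted mean of the flag would only match when all weights are equal.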

