CPS sample size April/June 2001 discrepancy

Peter · September 5, 2018, 4:47am

Hello,

I have the same question that was asked previously (sample size discrepancy between IPUMS CPS and raw CPS):

I am comparing the IPUMS version with the raw CPS files on the Census FTP. It seems that the IPUMS version has about 12,000 more households than the raw CPS files for each month. I think this issue has not been resolved or documented on the IPUMS website yet (or am I just missing it?). What is the source of the discrepancy, and how should I deal with it?

Thank you!

Peter

JeffBloem · September 6, 2018, 8:42pm

Thanks for following up on this question. We do have a bit more information to share. None of it fully resolves the issue, but it does offer some additional explanation for what might be going on here. In short, we suspect this is due to inconsistent inclusion of a SCHIP (State Children’s Health Insurance Program) oversample in this year, but the documentation is both scant and contradictory.

A few details to note. First, from September 2000 to August 2001 the basic monthly samples contain (or should contain) an oversample of about 12,000 households for SCHIP purposes (see Technical Paper 63, revised, Appendix J). According to that technical paper, this sample size increase was phased in by November 2000. Second, it seems that not all of the data files between November 2000 and August 2001 actually contain this oversample. In fact, the only samples that appear to have those extra SCHIP households are April 2001, June 2001, July 2001, and August 2001. Finally, the CPS codebooks reflect the discrepancies in record counts that we see in IPUMS CPS data. Case counts for the original input data for April, May, and June 2001 match those listed in the sample-specific codebooks on page 3 under the heading “Technical Description.” This makes us think that, even though the technical paper suggests that all samples from November 2000 to August 2001 should contain extra records, not all of them actually do. However, we haven’t been able to find anything in these codebooks to suggest a reason for these differing case counts among samples.

So, in conclusion, the CPS documentation contradicts itself. The technical paper says samples between November 2000 and August 2001 should include the oversample, but not all of them do. Further, the sample-specific documentation reflects the case counts that appear in the actual files.
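
If it helps to see exactly which months diverge, the comparison can be sketched in Python as below. This is only an illustration, not an official IPUMS or Census procedure: the `(year, month, household_id)` tuple layout, the function names, and the toy data are all assumptions, and you would need to map them onto the household identifiers in your own extract and in the raw files.

```python
from collections import defaultdict


def households_per_month(records):
    """Count distinct household IDs per (year, month).

    `records` is any iterable of (year, month, household_id) tuples.
    This layout is hypothetical; it is not the actual IPUMS or raw CPS format.
    """
    seen = defaultdict(set)
    for year, month, hh_id in records:
        seen[(year, month)].add(hh_id)
    return {key: len(ids) for key, ids in seen.items()}


def count_gaps(ipums_records, raw_records):
    """Return {(year, month): ipums_count - raw_count} for months in either source."""
    ipums = households_per_month(ipums_records)
    raw = households_per_month(raw_records)
    return {
        key: ipums.get(key, 0) - raw.get(key, 0)
        for key in sorted(set(ipums) | set(raw))
    }


# Toy data only (made up for illustration, not real CPS counts).
ipums_toy = [(2001, 4, "A"), (2001, 4, "B"), (2001, 4, "C"),
             (2001, 5, "A"), (2001, 5, "A")]  # repeated person rows in one household
raw_toy = [(2001, 4, "A"), (2001, 4, "B"), (2001, 5, "A")]

gaps = count_gaps(ipums_toy, raw_toy)
for (year, month), gap in gaps.items():
    print(f"{year}-{month:02d}: IPUMS minus raw = {gap} households")
```

A month with a gap near 12,000 households would be consistent with the oversample being present in one source and absent in the other; a gap near zero suggests both files agree for that month.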

