Why are there differences in the number of observations between IPUMS CPS and NBER data files?

I downloaded the CPS 2001 files through the IPUMS-CPS system and also from the NBER web page. The total number of observations in the first case is 1,712,631, while the NBER files report 1,742,243. Do these differences affect any descriptive or regression analysis?

Thank you.

It looks like you are getting these numbers from the NBER FTP website under Basic Monthlies for 2001, and the IPUMS-CPS All Samples page. If you compare the two lists of counts (the "P:" counts from the IPUMS page and the “Record Counts” from the NBER page), you will see that not only do the totals differ (as you pointed out), but none of the individual monthly counts match either. There are a couple of reasons for this.

First, the numbers on the IPUMS page include both basic and supplement samples, whereas the NBER counts are only for the basic monthly data. This difference is especially noticeable in March, because the March supplement includes an over-sample that does not show up in the basic monthly count found on the NBER site. As a result, the IPUMS March file has many more person records (because it includes both the basic and supplement respondents).

Second, the NBER files include a separate record for each non-response household. Because a non-response household does not have any person records, IPUMS does not count those records in the P: row. This leads NBER to have higher record counts in some months.

All of these differences add up to the total difference that you identified. If you were to download the supplement files from NBER (or count only basic monthly respondents from IPUMS-CPS) and eliminate all non-response households, the record counts in the two data sets would match.
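As a rough illustration of that reconciliation, here is a minimal Python sketch using the second route (counting only basic monthly respondents on the IPUMS side). The records are made up, and the field names `supplement_only` and `nonresponse_household` are hypothetical placeholders, not actual IPUMS-CPS or NBER variable names; you would need to substitute whatever flags your extracts actually provide.

```python
# Toy illustration only: the records are invented and the field names are
# hypothetical, not real IPUMS-CPS or NBER variable names.

def comparable_count(records, drop_supplement_only=False, drop_nonresponse=False):
    """Count records, optionally skipping supplement-only and non-response ones."""
    kept = 0
    for rec in records:
        if drop_supplement_only and rec.get("supplement_only", False):
            continue
        if drop_nonresponse and rec.get("nonresponse_household", False):
            continue
        kept += 1
    return kept

# IPUMS-style extract: person records only, basic plus supplement over-sample.
ipums_like = [
    {"person_id": 1},
    {"person_id": 2},
    {"person_id": 3, "supplement_only": True},
    {"person_id": 4, "supplement_only": True},
]

# NBER-style basic monthly file: person records plus a non-response household record.
nber_like = [
    {"person_id": 1},
    {"person_id": 2},
    {"household_id": 9, "nonresponse_household": True},
]

# Raw counts disagree, as in the published tables.
raw_ipums = comparable_count(ipums_like)
raw_nber = comparable_count(nber_like)

# After restricting both sides to basic monthly person records, they agree.
adj_ipums = comparable_count(ipums_like, drop_supplement_only=True)
adj_nber = comparable_count(nber_like, drop_nonresponse=True)

# Gap between the two totals reported in the question.
reported_gap = 1_742_243 - 1_712_631

print(raw_ipums, raw_nber, adj_ipums, adj_nber, reported_gap)
```

In this toy case the raw counts differ while the adjusted counts agree, which is the pattern described above. The last line simply computes the gap between the two totals you reported.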

I hope this helps.