Discrepancy in Census CPS file and IPUMS file?

paul_kiernan · May 8, 2025, 8:47pm

Hi folks, I’m trying to track CPS response rates by counting PERNUM=1 (in IPUMS extracts) and PULINENO=1 in CPS microdata files downloaded directly from Census. According to the IPUMS extract, there were 40,092 responses in March 2025. But according to the CSV file downloaded from Census, there were 39,568.

Am I doing something wrong, or is this a real discrepancy that has an explanation?

Ivan_Strahof · May 15, 2025, 5:37pm

Thank you for your patience while we looked into your question.

IPUMS extracts contain all of the same person and household records as the files downloaded directly from Census (records of non-interviewed households can be obtained by requesting a hierarchical extract). Since PERNUM is created by IPUMS, I’m assuming that you’re using the line number variable PULINENO (provided by IPUMS in the variable LINENO) to obtain counts in the original Census file and then comparing this number to the count of PERNUM = 1 records in your IPUMS extract. The difference you are seeing is therefore explained by households that have a person with PERNUM = 1 but no one with PULINENO = 1. Once these households are accounted for, both files show 40,092 respondent households in the March 2025 Basic Monthly Survey (BMS).
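As a rough illustration of that comparison, here is a minimal sketch on invented toy data. The household IDs and record layout are hypothetical and are not real CPS or IPUMS values. It counts PERNUM = 1 records, counts line number 1 records, and lists the households that account for the gap.

```python
# Toy person records: (household id, PERNUM, LINENO). All values are invented
# for illustration; real files identify households with other variables.
records = [
    ("A", 1, 1), ("A", 2, 2),
    ("B", 1, 2), ("B", 2, 3),   # the person on line 1 has left household B
    ("C", 1, 1),
]

# Count of records with PERNUM = 1 (one per household with person records).
pernum_1 = sum(1 for _, pernum, _ in records if pernum == 1)

# Count of records with line number 1 (misses households like B).
lineno_1 = sum(1 for _, _, lineno in records if lineno == 1)

# Households that have person records but nobody on line 1.
households = {hh for hh, _, _ in records}
with_line_1 = {hh for hh, _, lineno in records if lineno == 1}
missing_line_1 = sorted(households - with_line_1)

print(pernum_1, lineno_1, missing_line_1)  # 3 2 ['B']
```

In this toy case the two counts differ by exactly the number of households with no one on line 1, which is the pattern described above.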

PERNUM is created by sequentially numbering persons within each household for each monthly sample as they appear in the microdata. Meanwhile, PULINENO refers to the line in the household roster a particular person is enumerated on across their time in the panel. Since the CPS sample selects households (i.e., physical housing units) rather than individual people to be surveyed, people who exit the household (by moving out or through death) during the CPS panel are not recorded in subsequent survey months. In such cases, their PULINENO value (unlike PERNUM) is not reassigned to a different household member in the following survey month. The household will therefore have no members with that particular PULINENO value. The same procedure applies if all residents of the household move out and are replaced with a new family that moves in. This can result in households with no members who have PULINENO = 1.
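The contrast between the two variables can be sketched in a few lines. This is only an illustration under assumptions: the helper `number_persons` is hypothetical, and it assumes persons appear in the microdata in roster-line order, which may not hold in every file.

```python
def number_persons(line_numbers):
    """Pair each roster line present this month with a sequential PERNUM.

    Hypothetical helper: assumes persons appear in roster-line order.
    Returns a list of (PERNUM, line number) tuples.
    """
    return list(enumerate(line_numbers, start=1))

# Month 1: three people on roster lines 1, 2 and 3.
month_1 = number_persons([1, 2, 3])

# Month 2: the person on line 1 has moved out. The line numbers of the
# remaining people stay fixed, but PERNUM is renumbered from 1.
month_2 = number_persons([2, 3])

print(month_1)  # [(1, 1), (2, 2), (3, 3)]
print(month_2)  # [(1, 2), (2, 3)]
```

In the second month the household still has a PERNUM = 1 record, but no one with line number 1.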

Additionally, there are households where no one has ever been assigned PULINENO = 1 across their time in the panel (including households in their first month of enumeration). I was unable to find information on what this might indicate. I suspect these are cases where an individual was initially enumerated and assigned a PULINENO value, but was subsequently removed for not satisfying the conditions necessary to be included in the household roster. Without more information from the BLS or the Census Bureau, however, I cannot confirm whether this is the case.

Note that the BLS provides detailed information regarding CPS response rates on the CPS methods website.

paul_kiernan · May 15, 2025, 7:18pm

Thanks Ivan. So it sounds like, if I want to report in a news story the number of survey respondents in a given month, I’m better off using PERNUM=1. Is that right?

Ivan_Strahof · May 16, 2025, 6:03pm

For clarification, "survey respondents" can refer either to households that responded to the CPS or to persons represented in respondent households. Restricting your analytical sample to PERNUM = 1 will give you the number of responding households, which will be smaller than the number of persons represented in the survey because households can contain more than one person.

It is much more straightforward to obtain the number of responding households by reviewing the HHINTYPE codes tab. By selecting the case-count view on the codes tab for this variable, you can obtain the total number of interviewed and non-interviewed households for each sample in your data cart (or for the default samples if your cart is empty). In the screenshot below, you can see that in March 2025, 40,092 households were interviewed, 19,958 had a type A non-interview (for reasons such as refusal to participate and temporary absence), and 9,336 had a type B non-interview (vacant or occupied by persons ineligible for interview) or a type C non-interview (housing units that were demolished, converted to storage or business use, or included in the sample by mistake).

You can also review the sample size page to obtain the total number of people for whom data were provided in a given sample. While the household counts on this page combine interviewed and non-interviewed households, person records exist only for interviewed households; non-interviewed households have no person records. The screenshot below shows that the March 2025 sample contains 69,386 households in total, of which 40,092 were interviewed and include at least one person, and that information was collected on 93,649 persons.
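For anyone who wants to check how these figures fit together, here is a small arithmetic sketch using only the counts quoted in this thread. The response-rate line is an assumption on my part: it treats interviewed plus type A households as the eligible base, so please confirm the official definition on the BLS CPS methods site before reporting it.

```python
# Household and person counts for March 2025, as quoted in this thread.
interviewed = 40_092
type_a = 19_958      # type A non-interviews
type_bc = 9_336      # type B and type C non-interviews
persons = 93_649     # person records (interviewed households only)

# The three household categories should add up to the sample-size total.
total_households = interviewed + type_a + type_bc
print(total_households)  # 69386

# ASSUMPTION: eligible households = interviewed + type A. This is a sketch,
# not an official definition; check the BLS documentation.
eligible = interviewed + type_a
response_rate = interviewed / eligible

# Average number of person records per interviewed household.
persons_per_household = persons / interviewed

print(f"{response_rate:.1%}", f"{persons_per_household:.2f}")
```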
