One of my regression samples consists of foreign-born women aged 21 to 26 who were asked “Have you ever received a Pap smear/cervical cancer examination” in the 2013, 2015, 2018, 2019, and 2021 survey years. For the 2013 survey year, the question ID is AAU.530_00.010 in qadult.pdf.
I’m curious why so many women aged 21 and above in these survey years are coded as “N.I.U.” in the IPUMS NHIS data, given that they are eligible for cervical cancer screening and fall within the universe of females aged 18 and older. I hope someone can help me understand whether there is a specific reason, or some other relevant endogenous factor, that would cause women above age 21 to be skipped on this question.
Question ID AAU.530_00.010 in the NHIS corresponds to NHIS variable APSPAP. The question asks “Have you had a Pap smear or Pap test DURING THE PAST 12 MONTHS?” Using the IPUMS NHIS - NHIS variable concordance tool, I see that NHIS variable APSPAP corresponds to IPUMS NHIS variable PAPHAD1YR. The universe of this variable is female sample adults age 18+. The sample adult is the one adult in each sampled household who is randomly selected to answer a more detailed set of questions than those asked of the rest of the household members. Sample adults are identified in IPUMS variable ASTATFLG. Female respondents age 18+ who were not randomly selected as the sample adult in their household are not in universe for PAPHAD1YR, which is why they are coded NIU (not in universe). You can read more about the sampling design and sample adults in the 1997-2018 NHIS in this blog post. Note that the NHIS survey design changed between 2018 and 2019, and only sample adults and sample children are included in the data file beginning in 2019.
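If it helps, here is a minimal sketch in plain Python of how restricting to sample adults would change the NIU share. The records are made up for illustration, and the code values are assumptions that should be checked against the IPUMS NHIS codebook: ASTATFLG = 1 for a sample adult with a record, PAPHAD1YR = 0 for NIU, and SEX = 2 for female. The helper `niu_share` is a hypothetical name, not part of any IPUMS tool.

```python
# Toy records only; these are NOT real NHIS data.
# Assumed codes (verify in the IPUMS NHIS codebook):
SAMPLE_ADULT = 1  # ASTATFLG: sample adult, has record
NIU = 0           # PAPHAD1YR: not in universe
FEMALE = 2        # SEX: female

records = [
    {"AGE": 23, "SEX": 2, "ASTATFLG": 1, "PAPHAD1YR": 2},
    {"AGE": 24, "SEX": 2, "ASTATFLG": 3, "PAPHAD1YR": 0},
    {"AGE": 22, "SEX": 2, "ASTATFLG": 1, "PAPHAD1YR": 1},
    {"AGE": 25, "SEX": 2, "ASTATFLG": 3, "PAPHAD1YR": 0},
]


def niu_share(rows):
    """Return the fraction of rows coded NIU on PAPHAD1YR."""
    if not rows:
        return 0.0
    return sum(r["PAPHAD1YR"] == NIU for r in rows) / len(rows)


# All women aged 21 to 26, regardless of sample adult status.
women = [r for r in records if r["SEX"] == FEMALE and 21 <= r["AGE"] <= 26]

# The same women, restricted to those selected as the sample adult.
sample_adults = [r for r in women if r["ASTATFLG"] == SAMPLE_ADULT]

share_all = niu_share(women)
share_sample_adults = niu_share(sample_adults)

print(f"NIU share, all women 21-26: {share_all:.2f}")
print(f"NIU share, sample adults only: {share_sample_adults:.2f}")
```

In an actual extract, the same comparison should show whether the NIU cases are concentrated among women who were not selected as the sample adult; if NIU cases remain among sample adults, something beyond the sample adult design would need explaining.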