I am trying to use the variable colsigev, which looks like this in Stata in 2018:
. tab colsigev if year == 2018

    ever had colonoscopy, |
   sigmoidoscopy, or both |      Freq.     Percent        Cum.
--------------------------+-----------------------------------
                      niu |     55,172       75.75       75.75
              colonoscopy |      8,738       12.00       87.75
            sigmoidoscopy |        219        0.30       88.05
                     both |      1,226        1.68       89.74
                  neither |      6,902        9.48       99.21
        unknown - refused |         50        0.07       99.28
unknown - not ascertained |        422        0.58       99.86
     unknown - don’t know |        102        0.14      100.00
--------------------------+-----------------------------------
                    Total |     72,831      100.00
I am using the neither category to denote “no”. But if you tabulate it in 2021, there is no neither category. You get the following:
. tab colsigev if year == 2021

    ever had colonoscopy, |
   sigmoidoscopy, or both |      Freq.     Percent        Cum.
--------------------------+-----------------------------------
                      niu |     25,315       67.07       67.07
              colonoscopy |     11,040       29.25       96.32
            sigmoidoscopy |        163        0.43       96.75
                     both |      1,149        3.04       99.80
        unknown - refused |          4        0.01       99.81
unknown - not ascertained |          2        0.01       99.81
     unknown - don’t know |         70        0.19      100.00
--------------------------+-----------------------------------
                    Total |     37,743      100.00
Something seems drastically off. The percentage of people with colonoscopies also increases dramatically. Does anybody have any insight into this?
The differences you are observing in the response options for the COLSIGEV variable, and in distributions of responses, are a function of different universes.
In 2018, the universe of COLSIGEV is sample adults age 40+. Respondents are asked if they have ever had a colonoscopy, which is a yes/no question (COLEV), and if they have ever had a sigmoidoscopy, also a yes/no question (SIGEV). IPUMS used responses to COLEV and SIGEV to build COLSIGEV. People who responded no to both COLEV and SIGEV are coded as COLSIGEV==4 (neither).
In 2021, the universe of COLSIGEV is more restrictive; it includes only sample adults age 40+ who have had a colonoscopy, a sigmoidoscopy, or both, as reported in COLORECTEV. Respondents are first asked if they have ever had a colonoscopy or sigmoidoscopy, which is a yes/no question (COLORECTEV). Respondents who say yes are then asked which of the procedures (or both) they have had (COLSIGEV). Respondents who say no to the COLORECTEV yes/no question are not in the universe of COLSIGEV, so they appear as niu rather than as neither.
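If the goal is a yes/no indicator that is comparable across the two years, one possible approach is to take the "no" responses from COLSIGEV in 2018 and from COLORECTEV in 2021. The following is only a rough sketch: the variable name everscreen is hypothetical, and apart from COLSIGEV==4 (neither), the numeric codes used here (1, 2, 3 for colonoscopy, sigmoidoscopy, both; COLORECTEV 1 = no, 2 = yes) are assumptions that should be checked against the codebook or with the nolabel tabulations first.

```stata
* Check the numeric codes behind the value labels before recoding
tab colsigev   if year == 2018, nolabel
tab colorectev if year == 2021, nolabel

* Hypothetical harmonized indicator: 1 = ever had either procedure, 0 = never
* NIU and unknown responses are left as missing
generate byte everscreen = .

* 2018: "no" is COLSIGEV == 4 (neither); codes 1-3 are assumed to be
* colonoscopy, sigmoidoscopy, and both
replace everscreen = 1 if year == 2018 & inlist(colsigev, 1, 2, 3)
replace everscreen = 0 if year == 2018 & colsigev == 4

* 2021: "no" comes from COLORECTEV; 1 = no and 2 = yes are assumed codes
replace everscreen = 1 if year == 2021 & colorectev == 2
replace everscreen = 0 if year == 2021 & colorectev == 1

* Compare the distribution across years
tab everscreen year, column missing
```

With an indicator built this way, the share answering "yes" in each year is computed over the same group (sample adults age 40+ who gave a yes or no answer), which should make the two years easier to compare than the raw COLSIGEV tabulations.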
I hope this is helpful in understanding how the COLSIGEV variable is constructed and differs across samples. Please follow up with any additional questions.
Thank you, Isabella. This makes a lot of sense.