Why are there 53,959 cases in the 1980 census data and 1,447 cases in the 1990 census data that have 0 for WKSWORK2 and yet a nonzero, nonmissing value on at least one of the following labor income components (INCWAGE, INCBUS, INCFARM)? Are they simply data errors, or is there hidden information that I should be aware of?
(I have already dropped all cases whose values on the three income variables are allocated.)
I checked census and ACS data from 1970 to 2018 and found this issue only in 1980 and 1990. Military service should not be the explanation, because the questionnaire explicitly stated that military service counts toward weeks worked.
The universe for the income variables you referenced is persons aged 16+ (14+ in 1970). In contrast, the universe for WKSWORK2 is persons aged 16+ (14+ in 1970) who worked last year. While it may seem improbable that a person would report income without working, the skip patterns for the questionnaire do allow for this. The vast majority of people who did not work in the past year report $0 of income from any of these sources; the lowest share across all income source/year combinations is 98.8%, for INCWAGE in 1980. INCFARM and INCBUS are not available after 1990, so what seems more odd is that this is not an issue in 1970. The 1980 (and 1990) questionnaire asked considerably more information about income sources. It is possible that enumerator instructions or other descriptive text in 1980 were unclear about the additional income sources in 1980 and better clarified in 1990, but that is difficult to infer from the documentation.
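The share described above could be tabulated along the following lines. This is only a minimal sketch in plain Python: it assumes each record is a dict keyed by the IPUMS variable names, that WKSWORK2 equal to 0 marks cases outside the "worked last year" universe, and that 999999 and 999998 are the N/A and missing codes for the income variables. Those codes are assumptions that should be checked against the codebook for each variable and year, and the toy records are invented for illustration.

```python
from collections import defaultdict

# Assumed N/A and missing codes; verify against the IPUMS codebook.
MISSING_CODES = {999999, 999998}


def share_zero_among_nonworkers(records, income_var):
    """Return {year: share of WKSWORK2 == 0 cases whose income_var is 0}."""
    zero = defaultdict(int)
    total = defaultdict(int)
    for rec in records:
        # Keep only cases outside the "worked last year" universe.
        if rec["WKSWORK2"] != 0:
            continue
        value = rec[income_var]
        # Skip cases outside the income universe or with missing income.
        if value in MISSING_CODES:
            continue
        total[rec["YEAR"]] += 1
        if value == 0:
            zero[rec["YEAR"]] += 1
    return {year: zero[year] / total[year] for year in total}


# Toy records, invented for illustration only.
toy = [
    {"YEAR": 1980, "WKSWORK2": 0, "INCWAGE": 0},
    {"YEAR": 1980, "WKSWORK2": 0, "INCWAGE": 0},
    {"YEAR": 1980, "WKSWORK2": 0, "INCWAGE": 0},
    {"YEAR": 1980, "WKSWORK2": 0, "INCWAGE": 1500},
    {"YEAR": 1980, "WKSWORK2": 3, "INCWAGE": 9000},
    {"YEAR": 1990, "WKSWORK2": 0, "INCWAGE": 0},
    {"YEAR": 1990, "WKSWORK2": 0, "INCWAGE": 999999},
]
shares = share_zero_among_nonworkers(toy, "INCWAGE")
```

Running the same function with "INCBUS" or "INCFARM" as the second argument would give the corresponding shares for those sources, provided the records carry those keys.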
Thank you very much for the information. By
“The 1980 (and 1990) questionnaire asked considerably more information about income sources. It is possible that enumerator instructions or other descriptive text in 1980 were unclear about the additional income sources in 1980 and better clarified in 1990, but that is difficult to infer from the documentation.”
do you mean that the enumerator instructions might have misled respondents who were not working, and thus had no labor income, into reporting their nonlabor income (i.e., the “additional income sources”) as labor income (i.e., INCWAGE, INCFARM, and INCBUS)? Is that literally what you meant?
That is close, but not quite what I meant.
As you noted, there are no people with 0 values of WKSWORK2 who report income from these sources in 1970. The number of unexpected nonzero values you highlight in your initial post is much higher in 1980 than in 1990. I cannot find anything in the underlying data or universe statements that explains this difference. However, the changes to the questionnaire should be noted.
One possibility is that the additional income sources in 1980 were confusing to respondents, so they may have reported income in the wrong category or been unclear about what to report as income in general. In addition to the questionnaire listing more income sources, the text preceding the items that ask about these income sources also changed over the range of years you ask about (INCFARM question, INCBUS question, INCWAGE question). It is possible that changes to the questionnaire and/or enumerator instructions in 1990 made it easier to understand how to report income (again, the number of people reporting farm, business, or wage income who didn’t work is lower in 1990 than in 1980). I have not found research or documentation on how the questionnaire changes might affect the data, but I do want to highlight this change as something to consider as you determine how to define your analytical sample and interpret results.
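One way to act on this when defining an analytical sample is to flag the inconsistent cases rather than drop them outright, so results can be compared with and without them. The sketch below is hypothetical and not an IPUMS-recommended procedure: the helper names are made up, the records are assumed to be dicts keyed by variable name, and 999999 and 999998 are assumed to be the N/A and missing codes, which should be verified in the codebook.

```python
LABOR_INCOME_VARS = ("INCWAGE", "INCBUS", "INCFARM")

# Assumed N/A and missing codes; verify against the IPUMS codebook.
MISSING_CODES = {999999, 999998}


def is_inconsistent(rec):
    """True if WKSWORK2 is 0 but some labor income is nonzero and nonmissing."""
    if rec["WKSWORK2"] != 0:
        return False
    # INCBUS and INCFARM can be negative (losses), so test for != 0.
    return any(
        rec[var] != 0 and rec[var] not in MISSING_CODES
        for var in LABOR_INCOME_VARS
    )


def split_sample(records):
    """Return (consistent, flagged) lists without modifying the input."""
    consistent = [r for r in records if not is_inconsistent(r)]
    flagged = [r for r in records if is_inconsistent(r)]
    return consistent, flagged


# Toy records, invented for illustration only.
toy = [
    {"WKSWORK2": 0, "INCWAGE": 0, "INCBUS": 0, "INCFARM": 0},
    {"WKSWORK2": 0, "INCWAGE": 2500, "INCBUS": 0, "INCFARM": 0},
    {"WKSWORK2": 0, "INCWAGE": 0, "INCBUS": -400, "INCFARM": 0},
    {"WKSWORK2": 0, "INCWAGE": 999999, "INCBUS": 999999, "INCFARM": 999999},
    {"WKSWORK2": 4, "INCWAGE": 12000, "INCBUS": 0, "INCFARM": 0},
]
consistent, flagged = split_sample(toy)
```

Estimates computed on `consistent` alone and on the full set of records can then be compared to see whether the flagged cases matter for the results.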