After downloading datasets for 1980, 1990, 2000, 2010, and 2012, we reviewed the data and noticed that, on average, a third of respondents each year list “0” as their income under the incwage variable.
Is there an explanation for why so many people are listing zero?
We are trying to analyze the racial breakdown of a given occupation by income, and we are considering dropping all data points that list “0” as their incwage. We are concerned about how this would skew the results.
Can you give any more information about why respondents list an occupation but report an incwage of zero? Do you have any guidelines on including or excluding these data points, and on the effect that choice will have on the analysis?
Lastly, we are applying the perwt weight to our tabulations. If we were to drop all data points that list “0” as their income, what effect would applying perwt have on the data?
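To make the question concrete, here is a minimal sketch of the kind of perwt-weighted tabulation we mean, computed with and without the zero-incwage rows. The rows are made-up toy values, not records from our extract, and the race and occ codes and the function name are placeholders for illustration only.

```python
from collections import defaultdict

# Toy rows with made-up values (NOT real records from the extract).
rows = [
    {"race": "A", "occ": 100, "incwage": 0,     "perwt": 50},
    {"race": "A", "occ": 100, "incwage": 40000, "perwt": 100},
    {"race": "B", "occ": 100, "incwage": 30000, "perwt": 150},
    {"race": "B", "occ": 100, "incwage": 0,     "perwt": 100},
    {"race": "B", "occ": 100, "incwage": 60000, "perwt": 50},
]

def weighted_mean_income(records, drop_zero):
    """perwt-weighted mean incwage per (occ, race) cell (hypothetical helper)."""
    totals = defaultdict(float)   # sum of perwt * incwage per cell
    weights = defaultdict(float)  # sum of perwt per cell
    for r in records:
        if drop_zero and r["incwage"] == 0:
            continue  # exclude zero-income rows before weighting
        key = (r["occ"], r["race"])
        totals[key] += r["perwt"] * r["incwage"]
        weights[key] += r["perwt"]
    return {key: totals[key] / weights[key] for key in weights}

with_zeros = weighted_mean_income(rows, drop_zero=False)
without_zeros = weighted_mean_income(rows, drop_zero=True)

for key in sorted(with_zeros):
    print(key, with_zeros[key], without_zeros[key])
```

In this toy case, dropping the zero rows raises every weighted mean and also removes those rows' perwt from each cell's total, which is the effect we would like to understand for the real data.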
Thanks so much,