What is the logic behind INCTOT being coded “9999998 = Unknown/missing” in the Brazilian Censuses?


I have noticed that INCTOT is coded as “unknown/missing” as soon as one of its unharmonized source variables is coded “unknown/missing”. However, the converse is not true. Is it because some unharmonized source variables are not available through IPUMS (yet still cause INCTOT to be coded “unknown/missing” when they are themselves missing), or am I missing something?

Relatedly, I’d like to compute a proxy for municipalities’ GDP per capita by summing INCTOT across all individuals in each municipality (I know from experience that the sub-national data published by the Brazilian Institute of Statistics, the IBGE, is not reliable, so I’d rather compute this myself). Do you think that I should:

-Replace the unknown/missing values on INCTOT with the non-missing values on the source variables (in order to avoid underestimating the municipality’s GDP per capita)?

-Drop the individuals with “unknown/missing” values on INCTOT from the computation of the municipality’s population?

-Do nothing about this?

Lastly, can you please tell me why there are no “unknown/missing” values on INCTOT for the years 2000 and 2010? I guess that some respondents also failed to answer questions about their income in these years.

Thank you for your help!

PS: Note that the reference period for INCTOT in 1991 is August 1991, not August 1990 as written on the Questionnaire Text page.

The converse actually does appear to be true. Be careful with the coding of the unharmonized source variables in the 1980 sample, as most of them use a value of ‘9999999’ for ‘unknown/missing’ (please refer to the Codes page for the relevant unharmonized variables listed in this previous answer). With that in mind, all respondents with a value of “unknown/missing” for INCTOT do indeed have a value of “unknown/missing” for all unharmonized source variables.
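If you would like to verify this in your own extract, here is a minimal sketch in plain Python. The source-variable names (`src_income_a`, `src_income_b`) and their missing codes are hypothetical placeholders, not the actual IPUMS names; please substitute the names and codes shown on the Codes pages for your sample.

```python
from collections import Counter

INCTOT_MISSING = 9999998  # harmonized "Unknown/missing" code quoted in the question

# Hypothetical source-variable names mapped to their missing codes (assumptions);
# replace them with the real unharmonized variables and codes for your sample.
SOURCE_MISSING = {"src_income_a": 9999999, "src_income_b": 9999999}


def classify(record):
    """Return (INCTOT missing, any source missing, all sources missing) for one person."""
    flags = [record[name] == code for name, code in SOURCE_MISSING.items()]
    return record["INCTOT"] == INCTOT_MISSING, any(flags), all(flags)


def tabulate(records):
    """Count how many persons fall into each combination of the three flags."""
    return Counter(classify(r) for r in records)


# Made-up records, for illustration only.
records = [
    {"INCTOT": 9999998, "src_income_a": 9999999, "src_income_b": 9999999},
    {"INCTOT": 1500, "src_income_a": 1000, "src_income_b": 500},
    {"INCTOT": 9999998, "src_income_a": 9999999, "src_income_b": 200},
]

counts = tabulate(records)
for combo, n in sorted(counts.items()):
    print(combo, n)
```

Any cell in which the first flag disagrees with the source flags points to a miscoded missing value or to a source variable that is not yet in your list.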

INCTOT does not have any values of “unknown/missing” for the years 2000 and 2010, because the unharmonized source variables do not have values of “unknown/missing.” While that is likely not the most satisfying answer, there does not appear to be any indication of “unknown/missing” income values in the source data provided by the Brazil Census. Thus, while it is possible that some respondents did not report their income in 2000 and 2010, there is no way to detect this with the available data. Note, however, that fewer than 1% of respondents in the Universe had a value of “unknown/missing” for INCTOT in the 1980 and 1991 samples.
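To gauge how much the treatment of missing values matters for your municipality figures, one option is to compute the share of missing values and the mean income both ways. The sketch below is only an illustration under several assumptions: it is unweighted, it expects records already restricted to the INCTOT universe, and the municipality identifiers and incomes are made up.

```python
from collections import defaultdict

INCTOT_MISSING = 9999998  # harmonized "Unknown/missing" code quoted in the question


def summarize(records):
    """Summarize (municipality, INCTOT) pairs per municipality.

    Assumes unweighted, in-universe person records (an illustration only).
    """
    totals = defaultdict(lambda: {"n": 0, "missing": 0, "income": 0})
    for muni, inctot in records:
        t = totals[muni]
        t["n"] += 1
        if inctot == INCTOT_MISSING:
            t["missing"] += 1  # never add the missing code to the income sum
        else:
            t["income"] += inctot

    summary = {}
    for muni, t in totals.items():
        known = t["n"] - t["missing"]
        summary[muni] = {
            "missing_share": t["missing"] / t["n"],
            # Missing persons kept in the denominator (the "do nothing" option).
            "mean_all": t["income"] / t["n"],
            # Missing persons dropped from the denominator.
            "mean_known": t["income"] / known if known else None,
        }
    return summary


# Made-up records, for illustration only.
records = [("A", 1000), ("A", 9999998), ("A", 3000), ("B", 500), ("B", 1500)]
result = summarize(records)
for muni in sorted(result):
    print(muni, result[muni])
```

Whichever option you choose, the code 9999998 itself should not enter the income sum. With a small share of missing values, the two means should usually be close.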

Finally, you appear to be correct about the reference year in the source document. We will correct this.

Hope this helps.