I have noticed that INCTOT is coded as “unknown/missing” whenever one of its unharmonized source variables is coded “unknown/missing”. However, the converse is not true: INCTOT is sometimes “unknown/missing” even when none of the available source variables is. Is this because some unharmonized source variables are not available through IPUMS (but still cause INCTOT to be coded “unknown/missing” when they are missing), or am I missing something?
Relatedly, I’d like to compute a proxy for each municipality’s GDP per capita by summing INCTOT over all individuals in that municipality (I know from experience that the sub-national data published by the Brazilian Institute of Statistics, the IBGE, are not reliable, so I’d rather compute this myself). Do you think that I should:
-Replace the unknown/missing values on INCTOT with the non-missing values of its source variables (in order to avoid underestimating the municipality’s GDP per capita)?
-Drop individuals with “unknown/missing” values on INCTOT from the computation of the municipality’s population?
-Do nothing about this?
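To make the three options concrete, here is a minimal sketch of how each one changes the result. It assumes that INCTOT uses 9999998 for “unknown/missing” and 9999999 for “not in universe” (please check the codes page for the extract), that not-in-universe persons count in the population with zero income, and that the keys `muni`, `inctot`, `perwt` and `source_sum` are hypothetical names for the municipality, INCTOT, the person weight, and the sum of the non-missing source variables.

```python
from collections import defaultdict

# Assumed sentinel codes for INCTOT; verify them for your own extract.
MISSING = 9999998  # unknown/missing
NIU = 9999999      # not in universe

def per_capita_income(people, missing="drop"):
    """Weighted income per capita by municipality.

    missing: 'drop'   -> exclude missing cases from income and population
             'zero'   -> keep them in the population with zero income
             'impute' -> use 'source_sum' (hypothetical) as their income
    """
    income = defaultdict(float)
    population = defaultdict(float)
    for p in people:
        value = p["inctot"]
        if value == NIU:
            # Assumption: not-in-universe persons have no income but still count.
            value = 0
        elif value == MISSING:
            if missing == "drop":
                continue
            elif missing == "zero":
                value = 0
            elif missing == "impute":
                value = p.get("source_sum", 0)
            else:
                raise ValueError(f"unknown option: {missing}")
        income[p["muni"]] += value * p["perwt"]
        population[p["muni"]] += p["perwt"]
    return {m: income[m] / population[m] for m in population if population[m] > 0}

# Made-up records for one municipality, for illustration only.
people = [
    {"muni": "A", "inctot": 1000, "perwt": 10},
    {"muni": "A", "inctot": MISSING, "perwt": 10, "source_sum": 400},
    {"muni": "A", "inctot": NIU, "perwt": 20},
]

for option in ("drop", "zero", "impute"):
    print(option, per_capita_income(people, option))
```

In this sketch, "doing nothing" is read as keeping the missing cases with zero income. Leaving the sentinel code itself in the sum would add 9999998 per missing case to the total, so the code needs to be recoded in any case.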
Lastly, can you please tell me why there are no “unknown/missing” values on INCTOT for the years 2000 and 2010? I would expect that some respondents also failed to answer the income questions in those years.
Thank you for your help!
PS: note that the reference period for INCTOT in 1991 is August 1991, not August 1990 as written on the Questionnaire Text page.