Our research team is using Malaysian Census data from 1970 to 2000 to generate cohort sizes. We noticed that date-of-birth questions are listed among the Questionnaire Texts linked to the age variables for all census years. However, the date of birth variable is available for download only for the 1991 Census.
We are finding evidence of some clumping (age heaping) at ages that are multiples of 5, and especially 10, in the 2000 Census. We recently acquired 2010 Census data directly from the Malaysian Statistical Office, and the clumping there is much worse. Could you tell us whether responses to the date of birth question were used to construct the age variable in 2000, and whether any cleaning was done on your end?
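For reference, one common way to quantify this kind of clumping is Whipple's index, which compares the number of people reporting ages ending in 0 or 5 (within ages 23 to 62) to one fifth of the total in that age range. The snippet below is only a minimal sketch of that calculation, not necessarily the exact procedure we used; the function name and the `counts_by_age` input (a mapping from single year of age to a person count) are our own illustrative assumptions.

```python
def whipple_index(counts_by_age):
    """Whipple's index for heaping on ages ending in 0 or 5.

    counts_by_age: dict mapping single year of age (int) to a person count.
    This input layout is an assumption for illustration; adapt it to
    however the census extract is tabulated (weighted counts, if needed).
    """
    ages = range(23, 63)  # conventional range: ages 23 through 62 inclusive
    total = sum(counts_by_age.get(a, 0) for a in ages)
    if total == 0:
        raise ValueError("no population in ages 23-62")
    # Ages 25, 30, ..., 60 are the terminal-digit 0 and 5 ages in the range.
    heaped = sum(counts_by_age.get(a, 0) for a in ages if a % 5 == 0)
    # 100 means no preference for 0/5 ages; 500 means every age ends in 0 or 5.
    return 100.0 * 5 * heaped / total
```

Computing this separately for each census year would give a single comparable number per year, which may make it easier to describe how much worse the heaping is in one census than another.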