Extremely high number of observations for birth years that are multiples of 5 in the Indonesian 2010 census

Hi, I am working with the Indonesian 2010 census and find that the number of observations with birth years that are multiples of 5 is much larger than for other years.

For example, the number of observations born in 1959 is 231,827, while it’s 367,886 in 1960 and 205,861 in 1961. A similar pattern appears for other years, such as 1965 and 1970.

I checked the person weight and found that it is the same for all observations in the 2010 census.

I am attributing this to misreporting (people may round their birth years up or down). Is there any other specific reason? And is there a common way to deal with this problem?

Thanks a lot!


Your explanation is correct. As stated on the comparability tab for BIRTHYR: “The variable may suffer from respondent rounding (or “digit preference”) in some samples as seen with AGE, especially when birth year is based off of or asked in tandem with age. Therefore, birth years in five or ten year segments previous to the census year (e.g. 1958 for a 2008 sample) may be over-represented.” Unfortunately, there is not a great way to deal with this issue. Data quality fundamentally relies on accurate reporting by respondents, and for many people substantial measurement error affects recall of the exact month and year of birth.
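
If it helps to gauge how strong the heaping is, one rough check is to compare a birth year's count with the average count of its neighbouring years. Below is a minimal sketch in Python using the three counts quoted in the question, treating the two smaller counts as the years on either side of 1960. The function name heaping_ratio is made up for illustration and is not part of IPUMS or any library. This only describes the size of the problem; it does not correct it.

```python
def heaping_ratio(count, neighbour_counts):
    """Ratio of one birth year's count to the mean count of its neighbouring years.

    A value near 1.0 suggests no heaping; values well above 1.0 suggest
    that respondents are rounding into this year.
    """
    baseline = sum(neighbour_counts) / len(neighbour_counts)
    return count / baseline


# Counts taken from the question: 1960 and the two adjacent birth years.
ratio_1960 = heaping_ratio(367_886, [231_827, 205_861])
print(f"1960 vs. neighbours: {ratio_1960:.2f}")  # prints "1960 vs. neighbours: 1.68"
```

In other words, 1960 has roughly 68% more observations than its neighbours would suggest. Running the same comparison for 1965, 1970, and so on would show whether the rounding is equally strong across cohorts.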