Can anyone please tell me what a "." value in a Stata file means for a panel respondent? Indonesia dataset


#1

I am trying to look at migration trends between 2000 and 2010. When I merged the two samples for the respective years, the observations curiously appear as follows (see Image 1).

The serial and pernum variables together identify observations in the dataset uniquely. The screenshot above is just a quick browse of households with serial number 1000, but there are about 2.3 crore (23 million) observations.

id2010a_di~5: district 5 years ago, asked in 2010

id2000a_re~r: district 5 years ago, asked in 2000

My question is: what does the “.” under the variable id2010a_di~5 really mean? Every individual who has a value for id2000a_re~r has no corresponding value (shown as “.”) for id2010a_di~5, and vice versa. It cannot be the case that there was no migration between 2000 and 2010, so I am missing some link. Is it the interpretation of the “.”?

I ran count if (id2010a_dist5!=id2000a_reg5yr) & id2010a_dist5!=. & id2000a_reg5yr!=.

and it returned 0. I expected this to give me the number of people who migrated between 2000 and 2010.

Help Needed.

Thanks,


#2

The IPUMS-International microdata is cross-sectional, not panel data. As a result, the sample of households in the 2010 Indonesian Census is distinct from the 2000 Census sample and the SERIAL variable identifies households uniquely within a sample only, not across samples.
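If you want to check this in your own extract, something along these lines may help. It is only a sketch and assumes the pooled file contains a lowercase year variable alongside serial and pernum; adjust the names to match your extract.

```stata
* Assumption: the pooled extract contains year, serial, and pernum

* Report how many records share the same serial-pernum pair
* when the sample year is ignored
duplicates report serial pernum

* Within a sample year the pair should identify each person uniquely;
* isid exits with an error if it does not
isid year serial pernum
```

If the first command reports surplus copies while the second runs without error, the same serial and pernum values are being reused across the two samples for different people.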

This means that a Stata value of “.” signals a question was not asked in that sample year’s questionnaire. Specifically, ID2000A_REG5YR was collected in the 2000 Indonesian Census and ID2010A_DIST5 was collected in the 2010 Indonesian Census. Respondents in the 2010 sample were not asked where they lived 5 years prior to the 2000 Census, so this value will be missing in your data extract. Because no record has both variables non-missing, your count command returns 0.
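As a rough illustration, the pattern can be confirmed and five-year migration measured within a single sample as sketched below. The year variable is assumed to be in the extract, and cur_dist is a hypothetical name for whichever variable holds the district of current residence; it is not a real IPUMS variable name.

```stata
* Assumption: year is in the extract; cur_dist is a hypothetical
* variable holding the district of current residence

* Each migration variable should be populated in only one sample year
tabulate year if !missing(id2000a_reg5yr)
tabulate year if !missing(id2010a_dist5)

* Flag 2010 respondents whose district 5 years ago differs from
* their current district; other records are left missing
generate byte moved2010 = (id2010a_dist5 != cur_dist) if year == 2010 & !missing(id2010a_dist5, cur_dist)
tabulate moved2010
```

Before comparing, it is worth checking the codebook to make sure both variables use the same district coding scheme, and whether "unknown" or "not in universe" are stored as numeric codes rather than as “.”.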

Hope this helps.