Comparing the variables IND and CLASSWK in the Indonesian samples from 1980 to 2010,
I noticed that the number of observations coded NIU (not in universe) matches exactly in every sample EXCEPT 2010:
IND (2010) [IPUMS-I: descr: ID2010A_IND]
CLASSWK [https://international.ipums.org/international-action/variables/CLASSWK#codes_section]
Since IND and CLASSWK share the same universe, and CLASSWK comes from a follow-up question to IND, the discrepancy in 2010 alone is not intuitive.
Could you explain why this difference appears only in the 2010 sample?
My apologies for the delay in response. It took a bit of time to figure out the source of the discrepancy.
In 2010, a number of people who were neither employed nor temporarily away from a job unexpectedly have valid values for CLASSWK. All of these additional responses are self-employed persons, categorized as “own account, with temporary/unpaid help” in the detailed version of the variable. These persons are out of universe for IND, as expected.
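To see this pattern in an extract, one option is to cross-tabulate the NIU status of the two variables and count the records that are NIU for IND but carry a valid CLASSWK value. The following is a minimal sketch, not an IPUMS tool: it assumes the extract has been read into a list of dictionaries keyed by variable name, and it assumes NIU is coded 0 in both harmonized variables (the source variables use other codes, e.g. 999 and 9 in 2005, so check the codes tab before relying on these values). The function name niu_crosstab is hypothetical.

```python
from collections import Counter

# Assumed NIU codes for the harmonized variables; verify on the codes tab.
IND_NIU = 0
CLASSWK_NIU = 0


def niu_crosstab(records, ind_niu=IND_NIU, classwk_niu=CLASSWK_NIU):
    """Count records by (IND is NIU, CLASSWK is NIU)."""
    counts = Counter()
    for rec in records:
        key = (rec["IND"] == ind_niu, rec["CLASSWK"] == classwk_niu)
        counts[key] += 1
    return counts


# Toy records standing in for a 2010 extract (made-up values, not real data).
sample = [
    {"IND": 10, "CLASSWK": 1},  # in universe for both variables
    {"IND": 0, "CLASSWK": 0},   # NIU for both variables
    {"IND": 0, "CLASSWK": 1},   # NIU for IND, valid CLASSWK: the mismatch
    {"IND": 0, "CLASSWK": 0},   # NIU for both variables
]

counts = niu_crosstab(sample)
mismatch = counts[(True, False)]  # records NIU for IND with a valid CLASSWK
print(counts)
print("NIU in IND but valid CLASSWK:", mismatch)
```

If the universes were aligned, the (True, False) and (False, True) cells would both be empty, which is what the question reports for the samples other than 2010.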
Thank you for the reply.
Follow-up clarifying question: I am trying to investigate how the (un)employment rate in Indonesia changed across 2005, 2010, and 2015.
Does this imply that the NIU value for the IND variables is defined differently in 2010 than in 2005 (and 2015)?
e.g.
Again, the number of observations with Value=999 in ID2005A_IND matches exactly the number with Value=9 in ID2005A_CLASSWK: 635,630 observations each.
(IPUMS-I: descr: ID2005A_CLASSWK).
So a fair comparison over time can only be done using the *_IND variables, and not the *_CLASSWK variables?
You are correct that the NIU value is defined differently between these three samples (see the universe tab of IND for the universe definition, or the universe tabs for each of the source variables associated with IND).
Given these universe differences, you will need to define a consistent analytical subsample that has data available for your variables of interest in each of the three sample years. If you restrict your analysis to a subsample defined by a consistent set of characteristics, chosen with the universe coverage of these variables in mind, then you should be able to use both the IND and CLASSWK variables in your analysis.
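As a rough illustration of what such a restriction could look like, the sketch below applies one identical rule to every sample year and then counts the records that remain. Everything specific here is an assumption for illustration: the age floor of 15 is a hypothetical placeholder (the appropriate value depends on the universe statements of the three samples), NIU is assumed to be coded 0 in the harmonized variables, and the helper names are made up. The rule shown, keeping only records in universe for both IND and CLASSWK, is just one possible definition and may not suit every research question.

```python
from collections import defaultdict

MIN_AGE = 15  # hypothetical age floor; set it from the samples' universe tabs
NIU = 0       # assumed NIU code for harmonized IND and CLASSWK


def in_common_subsample(rec, min_age=MIN_AGE):
    """Apply the same restriction to a record from any sample year."""
    return (
        rec["AGE"] >= min_age
        and rec["IND"] != NIU
        and rec["CLASSWK"] != NIU
    )


def subsample_counts(records):
    """Count the records per year that satisfy the common restriction."""
    counts = defaultdict(int)
    for rec in records:
        if in_common_subsample(rec):
            counts[rec["YEAR"]] += 1
    return dict(counts)


# Toy records (made-up values, not real data).
records = [
    {"YEAR": 2005, "AGE": 30, "IND": 10, "CLASSWK": 2},  # kept
    {"YEAR": 2005, "AGE": 12, "IND": 10, "CLASSWK": 2},  # dropped: below age floor
    {"YEAR": 2010, "AGE": 40, "IND": 0, "CLASSWK": 1},   # dropped: NIU for IND
    {"YEAR": 2010, "AGE": 25, "IND": 20, "CLASSWK": 2},  # kept
    {"YEAR": 2015, "AGE": 50, "IND": 0, "CLASSWK": 0},   # dropped: NIU for both
    {"YEAR": 2015, "AGE": 22, "IND": 30, "CLASSWK": 1},  # kept
]

kept = subsample_counts(records)
print(kept)
```

Because the 2010 records that are NIU for IND but have a valid CLASSWK fail this rule, they drop out of the subsample, so the same population is compared in each year.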