Why are not all the "not in the labour force" categories filled for the EMPSTAT variable?

I am using the EMPSTAT variable from the basic monthly samples of IPUMS-CPS (time period: 1997-2013) to distinguish the labour market states employment, unemployment and not in the labour force (NILF). In my (weighted) results, I do not get enough NILF people, compared to official statistics. I was therefore wondering whether there are NILF people missing from the basic monthly samples. In particular: Why are the NILF codes 30, 31, 33, 35 not filled?

Best regards


With regard to the NILF codes, take note of the discussion on comparability of the “NILF” codes in the EMPSTAT variable over time. In particular, in the mid-1990s there was a redesign of the type of information available on persons not in the labor force. From 1995 onward, persons not in the labor force were grouped into three categories: “retired”, “disabled”, and “other”. This detail shouldn’t change the total number of persons “not in the labor force”, however. One detail to keep an eye on when trying to replicate official statistics is the method used for weighting the data in the official statistics. The WTFINL weight is most often used with person-level IPUMS-CPS basic monthly data. Additionally, we typically don’t expect to exactly replicate official statistics.

Dear Jeff,

Thanks a lot for your answer. I was simply surprised to see that the participation rate I compute for June 2010 with IPUMS-CPS is 73% (rather than 65% in official statistics), and that the unemployment rate is 6.3% (rather than over 9%). But if I understand you correctly, there are no groups of the population (such as housewives) that are systematically excluded from the IPUMS-CPS sample?

Best regards


Hi Ronald:

Although we typically don’t expect to exactly replicate official statistics, we do expect to be closer than what you are reporting here. There shouldn’t be anyone who is systematically excluded from the IPUMS-CPS sample who isn’t also systematically excluded from the BLS CPS samples. I’d suggest looking over how the official statistics are calculated and checking whether your analysis matches this methodology. I know that the BLS applies seasonal adjustments when calculating its figures. I’ll also mention that while the percentage doesn’t match, the weighted count between these two data sources is pretty close: 14.6 million in the BLS and 15 million in IPUMS-CPS (16+ population only).
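To make the methodology check concrete, here is a minimal Python sketch of how the two rates could be computed from weighted person records. It is only an illustration under stated assumptions, not the BLS procedure: it assumes two-digit IPUMS-CPS EMPSTAT codes in which 1x means employed, 2x unemployed and 3x not in the labor force, that code 0 (not in universe) and code 1 (Armed Forces) are left out of the civilian base, and that each record carries an age, an EMPSTAT code and a WTFINL weight. The function names and the sample records are hypothetical.

```python
from collections import defaultdict


def labor_force_group(empstat):
    """Map an EMPSTAT code to a labor force state (assumed coding scheme)."""
    # Assumption: 0 = not in universe, 1 = Armed Forces; both are excluded
    # from the civilian population used in the rates.
    if empstat in (0, 1):
        return None
    # Assumption: the first digit of the two-digit code gives the state.
    return {1: "employed", 2: "unemployed", 3: "nilf"}.get(empstat // 10)


def weighted_rates(records, min_age=16):
    """Compute weighted rates from (age, empstat, wtfinl) tuples."""
    totals = defaultdict(float)
    for age, empstat, wtfinl in records:
        if age < min_age:
            continue  # the rates are defined on the 16+ population
        group = labor_force_group(empstat)
        if group is None:
            continue
        totals[group] += wtfinl  # sum weights, not unweighted person counts

    labor_force = totals["employed"] + totals["unemployed"]
    population = labor_force + totals["nilf"]
    if labor_force == 0 or population == 0:
        raise ValueError("no weighted civilian records in the selected ages")

    return {
        # participation rate = labor force / civilian population
        "participation_rate": labor_force / population,
        # unemployment rate = unemployed / labor force (not / population)
        "unemployment_rate": totals["unemployed"] / labor_force,
    }


# Hypothetical records: (age, EMPSTAT, WTFINL)
sample = [
    (30, 10, 1500.0),  # employed, at work
    (45, 21, 500.0),   # unemployed
    (70, 36, 2000.0),  # not in the labor force
    (12, 0, 1800.0),   # under 16, dropped
    (25, 1, 900.0),    # Armed Forces, dropped
]
rates = weighted_rates(sample)
print(rates)
```

Two things are worth comparing against your own code: whether NILF persons are included in the denominator of the participation rate, and whether the unemployment rate is divided by the labor force rather than by the whole population. Even with both in place, unadjusted monthly figures will not line up exactly with seasonally adjusted official numbers.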


I’ve done a ton of work with BLS data. Just started with NHIS. There are 20 possible responses to the EMPSTAT question. Some are not mutually exclusive. Others do not conform to BLS methods. BLS says you are employed if you did any work for pay during the survey week. You are unemployed if you did no work for pay and actively looked for work during the three weeks preceding the survey week. (I may be wrong about the exact time periods. I have a question in with BLS asking for clarification.)

Importantly, some of the EMPSTAT responses refer to “the last week or two.” That does not conform to either BLS methodology. Also, response code 34 “Unemployed: looking or on layoff” actually conflates both labor force categories. If you’re looking for work you’re unemployed. If you’re on layoff you are counted as employed.

My guess about the “missing” variables is that they are being recoded to conform to the definitions in the DOINGLW2 variable. (This variable is described on p. 2 of the DRAFT 2017 NHIS Questionnaire - Sample Adult (Adult Identification).) If I am correct, it would be helpful to know the recoding methodology.

I trust the excellent NHIS staff to correct any of my misconceptions and misunderstandings.


Tony Lima

Professor Emeritus of Economics

First, I think it is important to note that this thread began with a question about EMPSTAT in IPUMS CPS, which does vary from EMPSTAT in IPUMS NHIS. However, as noted here, “The first digit of the coding system for EMPSTAT groups all adults (age 17+ in 1969-1981, age 18+ in 1982 forward) into one of four main categories: employed, with a job but not at work, unemployed, and not in the labor force. These categories correspond to main employment status categories recognized by the U.S. Bureau of Labor Statistics and are used in other sources such as the Current Population Survey and the U.S. census.”

The remainder of my response is related to questions raised by Tony about EMPSTAT in IPUMS NHIS.

The description of EMPSTAT provides some insight into “employment status in past 1 to 2 weeks”; from 1969-1996, the reference period was the preceding two weeks and from 1997 forward, the reference period was the preceding week.

While many EMPSTAT categories are available over time, the methodology changed in the mid-1990s. As such, it appears that within a given year, the categories available through IPUMS NHIS are mutually exclusive.

Additionally, according to the BLS, workers expecting to be recalled from temporary layoff are counted as unemployed whether or not they have engaged in a specific job seeking activity. EMPSTAT codes 31-34 encapsulate this population.

Finally, an individual’s EMPSTAT will be missing for the following reasons: they are not in the universe, or their employment status is unknown. While samples from 1969-1996 assign EMPSTAT values based on multiple questions, from 1997 forward, respondents were asked one general question to ascertain their employment status. Survey text from each year can be found here.

Many thanks to Ronald and Michelle for their patience and detailed explanations. I now have the data I need and understand the relationship with BLS data and standards. Now I just need to figure out what the data is telling me. :)

As compensation, I offer a wine blog run by my lovely wife and me: http://CaliforniaWineFan.com.