I am trying to estimate mean and median hours worked per week and weeks worked per year for a subsample of current workers (restricted to specific occupations and industry, and one state). I downloaded the CPS basic monthly files for the years of interest, restricted them to ORG records (i.e., kept only MISH=4 or MISH=8), and retained only those who are actively employed. All of that works fine, and I’ve got earnings per week for most of the sample. However, almost half the sample is missing usual hours worked per week (UHRSWORKORG), and 94% are missing weeks worked per year (WKSWORKORG). In both cases, the missing values are coded as 999, which means Not In Universe. This is technically incorrect, however, because these are all respondents in their 4th or 8th month in sample, so they should have received the earner study questions.
I’m using data from 2020 (MISH=8) and 2021 (MISH=4 | MISH=8). Did something change in how these questions were administered during the pandemic?
The universe for UHRSWORKORG is “employed civilians 15+ in outgoing rotation groups that were paid by the hour. Excludes self-employed persons.” Therefore, in addition to excluding workers based on EMPSTAT and MISH, you should also use PAIDHOUR and CLASSWKR. Doing so for the 2021 basic monthly survey (BMS) files results in about 25% of the sample being not-in-universe. The universe for WKSWORKORG is “civilians 15+ currently employed as wage/salary workers and in 2 (out of 8) rotation groups and is paid annually. Excludes self-employed persons.” Therefore, you also need to consider the number of months an individual appeared in the survey when examining the universe for the variable. Doing so should reduce the share of your sample that is not in the universe. That being said, there are still cases with NIU codes that I would expect to be in universe based on the provided definition. We are looking into this further and will provide an update when we know more.
As an alternative, you might be interested in using UHRSWORKT and AHRSWORKT (available in all BMS samples) or UHRSWORKLY and WKSWORK1 (available in ASEC samples) to estimate hours worked. These have fewer universe restrictions than the earner study variables. This page compares all of the mentioned hours worked variables.
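For the mean and median asked about in the original question, a minimal sketch of a weighted estimate from UHRSWORKT might look like the following. It is an illustration only: the records are made up, 999 is treated as the NIU code as described in this thread, and WTFINL is assumed to be the appropriate person weight for basic monthly estimates. The codebook may list other special codes that need the same exclusion.

```python
# Sketch: weighted mean and median of usual weekly hours.
# ASSUMPTIONS (not confirmed by this thread): WTFINL is the weight to use,
# and 999 is the only special code that must be dropped.
NIU_CODES = {999}


def weighted_mean(values, weights):
    """Weighted arithmetic mean of values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def weighted_median(values, weights):
    """Smallest value at which cumulative weight reaches half the total."""
    half = sum(weights) / 2
    cumulative = 0.0
    for value, weight in sorted(zip(values, weights)):
        cumulative += weight
        if cumulative >= half:
            return value


# Hypothetical records standing in for rows of an extract.
rows = [
    {"UHRSWORKT": 40, "WTFINL": 1500.0},
    {"UHRSWORKT": 35, "WTFINL": 1000.0},
    {"UHRSWORKT": 999, "WTFINL": 1200.0},  # NIU, excluded below
    {"UHRSWORKT": 50, "WTFINL": 500.0},
]

valid = [r for r in rows if r["UHRSWORKT"] not in NIU_CODES]
hours = [r["UHRSWORKT"] for r in valid]
weights = [r["WTFINL"] for r in valid]

mean_hours = weighted_mean(hours, weights)
median_hours = weighted_median(hours, weights)
print(mean_hours, median_hours)
```

Dropping the special codes before computing anything matters here, because a value like 999 would otherwise be averaged in as if it were a real number of hours.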
Thanks for responding, Ivan. I checked my data to see whether sample respondents weren’t paid hourly or were self-employed (the former is pretty uncommon for this occupation). Of the 107 respondents in the sample, 9 reported being self-employed and another 9 are paid on a salary basis, which still leaves 89 individuals who were part of the ORG earner study and for whom there should be data on UHRSWORKORG. Data is missing for 24 of them.
Your response re: WKSWORKORG makes it sound like only those who are MISH=8 should have data on WKSWORKORG. Is that correct? It still doesn’t answer the question of why so many (92% of my sample who are MISH=8) are coded as NIU. But I can investigate whether some of the people for whom I have month 8 observations didn’t respond in all of the other 7 months.
Thanks for continuing to investigate.
I previously mentioned that you should use PAIDHOUR to define the universe for UHRSWORKORG. However, UH_ERNPER_B1 = 1 is the correct way to define being “paid by the hour”. PAIDHOUR includes not only those who are paid by the hour, but also those who provided an hourly rate even though they were not paid by the hour. This is noted on the variable description page for UHRSWORKORG. Using UH_ERNPER_B1, I’m finding only 253 observations (0.45% of the sample) that are not-in-universe in a sample of all 2021 basic monthly surveys.
For WKSWORKORG, you will want to restrict your sample to UH_ERNPER_B1 = 6. Respondents in either MISH 4 or MISH 8 should be eligible for WKSWORKORG. With this restriction, you should be left with 6,812 cases (16% of the sample) with not-in-universe values for the 2021 BMS sample. I hope to have more information about WKSWORKORG soon that will help decrease the number of these observations even further.
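The universe checks described above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the rows are hypothetical, they are assumed to be already restricted to employed, non-self-employed civilians, 999 is taken as the NIU code as reported earlier in this thread, and the helper names are invented for the example.

```python
# Sketch: flag which ORG variable each record should be in universe for,
# then compute the share of expected-in-universe records coded NIU.
NIU = 999  # assumption: NIU code as reported in this thread


def expected_universe(row):
    """Return the ORG variable this row is expected to have data for, if any."""
    if row["MISH"] not in (4, 8):  # outgoing rotation groups only
        return None
    if row["UH_ERNPER_B1"] == 1:   # paid by the hour
        return "UHRSWORKORG"
    if row["UH_ERNPER_B1"] == 6:   # paid annually
        return "WKSWORKORG"
    return None


def niu_share(rows, var):
    """Share of rows expected in universe for var that are coded NIU."""
    eligible = [r for r in rows if expected_universe(r) == var]
    if not eligible:
        return None
    return sum(r[var] == NIU for r in eligible) / len(eligible)


# Hypothetical records; the value 2 for UH_ERNPER_B1 is just a placeholder
# for "some other pay periodicity".
rows = [
    {"MISH": 4, "UH_ERNPER_B1": 1, "UHRSWORKORG": 40, "WKSWORKORG": 999},
    {"MISH": 8, "UH_ERNPER_B1": 1, "UHRSWORKORG": 999, "WKSWORKORG": 999},
    {"MISH": 4, "UH_ERNPER_B1": 6, "UHRSWORKORG": 999, "WKSWORKORG": 52},
    {"MISH": 8, "UH_ERNPER_B1": 6, "UHRSWORKORG": 999, "WKSWORKORG": 999},
    {"MISH": 8, "UH_ERNPER_B1": 6, "UHRSWORKORG": 999, "WKSWORKORG": 50},
    {"MISH": 2, "UH_ERNPER_B1": 1, "UHRSWORKORG": 999, "WKSWORKORG": 999},
    {"MISH": 4, "UH_ERNPER_B1": 2, "UHRSWORKORG": 999, "WKSWORKORG": 999},
]

share_hours = niu_share(rows, "UHRSWORKORG")
share_weeks = niu_share(rows, "WKSWORKORG")
print(share_hours, share_weeks)
```

Computing the share only among records expected to be in universe is what makes the resulting percentage comparable to the figures quoted above.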
The CPS team took a closer look at WKSWORKORG, but unfortunately wasn’t able to find a conclusive reason why there are so many NIU observations. The codebooks just say the universe should be respondents who are paid annually (UH_ERNPER_B1 == 6). The question about how often they were paid was asked of all ORG respondents (i.e. employed, but not self-employed, civilians age 18+). However, that leaves 20-30 who have responses despite not meeting those criteria (specifically, not being paid annually) and 400-600 each month who do not have responses even though they fit the bill. This pattern appears in the original data and is most likely the result of the Census Bureau applying a universe whose rules were not made (easily) available. Hope this information is useful.
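Anyone wanting to reproduce this kind of tally on their own extract could cross-classify the documented universe against the observed values, month by month. The sketch below is hypothetical: the records are invented, 999 is assumed to be the NIU code as reported in this thread, the rows are assumed to be ORG respondents already, and the category labels are made up for the example.

```python
from collections import Counter

# Sketch: count, per month, records whose WKSWORKORG value disagrees with
# the documented universe (paid annually, UH_ERNPER_B1 == 6).
NIU = 999      # assumption: NIU code as reported in this thread
ANNUAL = 6


def classify(row):
    """Label a row by whether its WKSWORKORG value matches the stated universe."""
    paid_annually = row["UH_ERNPER_B1"] == ANNUAL
    has_response = row["WKSWORKORG"] != NIU
    if paid_annually:
        return "consistent_response" if has_response else "unexpected_niu"
    return "unexpected_response" if has_response else "consistent_niu"


def mismatches_by_month(rows):
    """Count rows per (MONTH, label) pair."""
    counts = Counter()
    for row in rows:
        counts[(row["MONTH"], classify(row))] += 1
    return counts


# Hypothetical ORG records for two months.
rows = [
    {"MONTH": 1, "UH_ERNPER_B1": 6, "WKSWORKORG": 52},
    {"MONTH": 1, "UH_ERNPER_B1": 6, "WKSWORKORG": 999},
    {"MONTH": 1, "UH_ERNPER_B1": 1, "WKSWORKORG": 48},
    {"MONTH": 2, "UH_ERNPER_B1": 6, "WKSWORKORG": 999},
    {"MONTH": 2, "UH_ERNPER_B1": 1, "WKSWORKORG": 999},
]

counts = mismatches_by_month(rows)
for key in sorted(counts):
    print(key, counts[key])
```

The two "unexpected" labels correspond to the two groups described above: responses from people not paid annually, and NIU codes for people who are.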
Thanks for investigating. I agree with your conclusions and am glad it wasn’t due to something I was doing wrong.