Calculating number of workers earning below a threshold

Nick_Wells · March 22, 2021, 4:00pm

I’m pretty new to analyzing IPUMS data and want to make sure I’m using the data correctly. I’m trying to get a read on the number of people who report weekly earnings above and below a certain threshold (in this case, $500), broken down by sex. For this, I’m using CPS monthly data since 2019. My extract includes the following variables: SEX, LABFORCE, EARNWT and EARNWEEK, and below is my code in R.

library(dplyr)
library(haven) #provides as_factor() for labelled IPUMS variables

cps_data %>%
	mutate(EARNWEEK_2 = as.numeric(as.character(EARNWEEK))) %>% #refer to columns by bare name inside the pipe, not cps_data$...
	mutate(earn_bins = ifelse(EARNWEEK_2 <= 500, 'under 500', 'over 500')) %>% #bin weekly earnings: $500 or less vs. more than $500
	filter(LABFORCE == 2 & EARNWEEK != 9999.99) %>% #exclude those not in the labor force and EARNWEEK NIUs
	group_by(YEARMONTH = paste(YEAR, MONTH, sep = '-'), SEX_factor = as_factor(SEX), earn_bins) %>% #group by year/month, sex and earnings bin
	summarize(n = sum(EARNWT), EARNWEEK_avg = weighted.mean(EARNWEEK_2, EARNWT)) #summarize using EARNWT

First, does my methodology seem sound? I’m surprised at the monthly variability of the data. It’s not out of the realm, but choppier than I would have guessed.
Second, I’m wondering why the monthly totals don’t match other sources. For example, the BLS has some 75 million women aged 16+ in the labor force as of February, and my analysis has only 65 million. Is this discrepancy likely due to the way CPS data categorizes workers?

Grace_Cooper · April 5, 2021, 9:02pm

The method you are using to estimate average monthly earnings looks good to me. The monthly variability I calculated using your method (a swing of about 50) is within a reasonable range that I would expect, given that the 25th and 75th percentiles of the distribution of EARNWEEK for your data are 420 and 1076, respectively.
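For anyone who wants to check percentiles like these on their own extract, here is a minimal sketch of a weighted percentile in base R. The helper `weighted_quantile` is a hypothetical name, not part of any IPUMS tooling, and the earnings and weights are made-up toy values. It is one simple definition of a weighted percentile (no interpolation) and is not necessarily how the figures above were computed.

```r
# Hypothetical helper: returns the smallest value of x whose cumulative
# share of the total weight reaches p (no interpolation between values).
weighted_quantile <- function(x, w, p) {
  ord <- order(x)                    # sort values in ascending order
  x <- x[ord]
  w <- w[ord]
  cum_share <- cumsum(w) / sum(w)    # cumulative share of total weight
  x[which(cum_share >= p)[1]]
}

# Toy data, invented for illustration only (stand-ins for EARNWEEK and EARNWT).
earnings <- c(300, 400, 600, 900, 1500)
weights  <- c(2, 1, 1, 1, 1)

q25 <- weighted_quantile(earnings, weights, 0.25)
q75 <- weighted_quantile(earnings, weights, 0.75)
```

On a real extract you would pass the filtered EARNWEEK and EARNWT columns instead of the toy vectors.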

Population estimates based on the subset of people included in the Outgoing Rotation Group (Earner Study) will differ from BLS estimates because of eligibility. The universe for the Earner Study questions is more restrictive than the labor force as a whole. You can identify these cases with ELIGORG==1, which is equivalent to EMPSTAT in (10,12) & CLASSWKR in (22,23,25,27,28). Note that this excludes all self-employed workers (both incorporated and unincorporated), the military, and unpaid family workers. Because this universe is narrower than the one BLS uses for its labor force estimates, you will get lower population estimates.
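As a rough illustration of the universe restriction, the sketch below filters on ELIGORG before summing EARNWT. The data frame `toy_cps` and every value in it are made up, and the bin label is my own; the variable names are the ones from this thread. Treat it as a sketch under those assumptions rather than a definitive recipe.

```r
library(dplyr)

# Toy stand-in for a CPS extract; all values are invented for illustration.
toy_cps <- tibble(
  SEX      = c(1, 2, 2, 1, 2, 1),
  LABFORCE = c(2, 2, 2, 2, 1, 2),
  ELIGORG  = c(1, 1, 0, 0, 0, 1),
  EARNWEEK = c(450, 800, 9999.99, 9999.99, 9999.99, 500),
  EARNWT   = c(1000, 1200, 0, 0, 0, 900)
)

# Keep only the Earner Study universe, then count weighted workers per bin.
earner_counts <- toy_cps %>%
  filter(ELIGORG == 1) %>%
  mutate(earn_bin = if_else(EARNWEEK <= 500, "500 or under", "over 500")) %>%
  group_by(SEX, earn_bin) %>%
  summarize(n_weighted = sum(EARNWT), .groups = "drop")

# Number of rows in the labor force vs. rows in the Earner Study universe.
n_labforce <- sum(toy_cps$LABFORCE == 2)
n_eligible <- sum(toy_cps$ELIGORG == 1)
```

In the toy data, five rows are in the labor force but only three are in the Earner Study universe, which is the same kind of gap that shows up when comparing EARNWT totals against BLS labor force counts.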
