I read Eugene Ludwig’s article on Politico, where he says: “If you filter the statistic to include as unemployed people who can’t find anything but part-time work or who make a poverty wage (roughly $25,000), the [unemployment] percentage is actually 23.7 percent.”
I tried to replicate this using CPS data. The 2024 data files are here:
https://www2.census.gov/programs-surveys/cps/datasets/2024/basic/
Here Ludwig describes the approach. Sadly, what is described in the PDF does not map one-to-one onto the technical documentation. When I filter on prwkstat being 6, 8, 10, or 12, or hefaminc being less than 6, I get around 30% unemployment, which is higher than Ludwig’s result. I am using the CSV file for this.
Is there any code or script available that implements what Ludwig describes? If so, I’d love to see some working examples.
Thanks,
It sounds like you are trying to replicate estimates from an article (not linked in your post) based on the methodology (linked in your post) and running up against some discrepancies. As a starting point, I see that you are using the original Census Bureau CPS files for your replication work. Note that IPUMS CPS uses different variable names and codes, so I would not expect the instructions in the PDF that you linked to exactly match the Census Bureau data file. Based on your post and a quick review of the documentation, I can see a number of likely reasons that your estimates do not match those of the author.
- The author describes different approaches for wage/salary workers as opposed to self-employed workers, and they use different samples of the CPS for these estimates. They use variables from the Outgoing Rotation Groups (ORG) (1/4 of the sample in any given month) for wage and salary workers, and the ASEC supplement (released as the March supplement file) for self-employed workers (who are excluded from the ORG data). The data file you linked to is the 2024 basic monthly data (which includes the ORG data but will not include the ASEC data). This presentation provides a brief overview of the different types of data in the CPS.
- You have listed two variables in your post (prwkstat and hefaminc). It is possible that you have included additional variables that you don’t describe here, but work status and income are a relatively small subset of the variables that the author describes using in their documentation, particularly because they use both the ORG and the ASEC samples.
- Your income variable appears to be annualized family income from the basic monthly survey (BMS). The author is not using family income from the BMS. They annualize weekly earnings from the ORG (IPUMS variable EARNWEEK or EARNWEEK2) for wage/salary workers, and they use the ASEC for self-employed workers, specifically the variables for wages/salary (IPUMS variable INCWAGE), non-farm business income (IPUMS variable INCBUS), farm income (IPUMS variable INCFARM), and Social Security income (IPUMS variable INCSS).
- It isn’t clear to me how you weighted your analyses. Note that because the author uses both ORG and ASEC data, you will need to create a new weight that varies depending on the sample where each record originates.
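For readers working outside Stata, the earnings and weighting points in the list above can be sketched in Python. This is only a minimal illustration under stated assumptions, not the author's implementation: the record layout (dicts with the keys `sample`, `weekly_earn`, `weeks_worked`, `annual_income`, `org_weight`, and `asec_weight`) is hypothetical, the numbers are made up, and the 50-week multiplier is the methodology's assumption for records without reported weeks worked.

```python
from typing import Optional

# Assumption from the methodology: use 50 weeks when weeks worked is not reported.
ASSUMED_WEEKS = 50


def annual_earnings(record: dict) -> Optional[float]:
    """Annual earnings for one record, or None when earnings are unavailable.

    ORG records (wage/salary workers): weekly earnings times weeks worked.
    ASEC records (self-employed workers): annual income as reported.
    """
    if record["sample"] == "ORG":
        weekly = record.get("weekly_earn")
        if weekly is None:
            return None
        weeks = record.get("weeks_worked")
        if weeks is None or weeks <= 0:
            weeks = ASSUMED_WEEKS
        return weekly * weeks
    if record["sample"] == "ASEC":
        return record.get("annual_income")
    return None


def analysis_weight(record: dict) -> Optional[float]:
    """Pick the weight that matches the sample the record comes from."""
    if record["sample"] == "ORG":
        return record.get("org_weight")
    if record["sample"] == "ASEC":
        return record.get("asec_weight")
    return None


# Made-up example records (hypothetical values, not CPS data).
org_record = {"sample": "ORG", "weekly_earn": 600.0, "weeks_worked": None,
              "org_weight": 1500.0}
asec_record = {"sample": "ASEC", "annual_income": 18000.0, "asec_weight": 900.0}

print(annual_earnings(org_record), analysis_weight(org_record))
print(annual_earnings(asec_record), analysis_weight(asec_record))
```

Keeping the earnings rule and the weight rule in separate functions makes it easier to check that every record is paired with the weight from its own sample.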
Using IPUMS CPS, I followed the logic laid out in the PDF that you linked for the 2022 and 2024 ASEC and ORG samples. I estimate 23.4% unemployment, which is pretty close to the projected result that you quote (I didn’t see that figure in the documentation, but assume it is in the article). My Stata code is below for reference, in case it is helpful for following the logic described in the PDF. Note that I didn’t dig into the concepts presented; I simply tried to confirm that I could get reasonably close to this statistic when using the IPUMS CPS variable names and the guidelines from the documentation. Additionally, I performed only a perfunctory recode of variables for this quick exercise and didn’t replicate the full methodology. Specifically, I didn’t account for inflation, perform the linear interpolation in Section B3, or create a weighted average based on class of worker (wage/salary versus self-employed) as described in Section C.
#delim;
clear;
set more off;
qui do cps_00166.do;
*definition of functional unemployment from PDF shared;
*modifies BLS U-3 unemployment rate in 2 ways;
*full-time work (35+ hours) or part-time with no desire for full-time job;
*earn at least 25K annually;
******U-3, reference IPUMS FAQ post on using IPUMS CPS variables to calculate U-3: https://blog.popdata.org/alt-measures-unemployment/;
gen u3numerator = 0;
replace u3numerator = 1 if empstat == 21 | empstat == 22;
gen u3denominator = 0;
replace u3denominator = 1 if empstat >= 10 & empstat <= 22;
tab u3numerator u3denominator;
******Differentiate self-employed from wage-employees;
gen classwkr_new = .;
*self-employed;
replace classwkr_new = 1 if classwkr == 13 | classwkr == 14;
*wage/salary;
replace classwkr_new = 2 if classwkr >= 22 & classwkr <= 28;
******wage-employees use ORG data;
*full-time workers or those working part-time by choice;
gen fulltime = 0;
*ft workers;
replace fulltime = 1 if wkstat >= 10 & wkstat <= 13 & classwkr_new == 2;
*pt by choice workers;
replace fulltime = 1 if (whyptlwk == 30 | whyptlwk == 50 | (whyptlwk >= 90 & whyptlwk <= 111) | (whyptlwk >= 121 & whyptlwk <= 124)) & classwkr_new == 2;
*wages;
*Weekly earnings (EARNWEEK in 2022, EARNWEEK2 in 2024) times weeks worked;
*new var that reports weekly earnings using appropriate var depending on year and omits NIU cases;
gen wklyearn = .;
replace wklyearn = earnweek if year == 2022 & earnweek < 9999.99 & classwkr_new == 2;
replace wklyearn = earnweek2 if year == 2024 & earnweek2 < 999999.99 & classwkr_new == 2;
*new var that applies authors 50-week assumption for missing weeks worked;
gen weeksworked = .;
replace weeksworked = 50 if wklyearn != . & (wksworkorg == 98 | wksworkorg == 0) & classwkr_new == 2;
*earnings;
gen earnings = wklyearn * weeksworked if classwkr_new == 2;
*weight;
gen new_wt = round(earnwt) if classwkr_new == 2;
******self-employed use ASEC data;
*ft and desired pt workers;
replace fulltime = 1 if uhrsworkly >= 35 & uhrsworkly < 999 & classwkr_new == 1;
replace fulltime = 1 if whyptly == 2 & classwkr_new == 1;
*earnings;
*before summing incomes create alternate versions without NIU codes;
gen alt_incwage = incwage if incwage < 99999999 & classwkr_new == 1;
gen alt_incbus = incbus if incbus < 99999999 & classwkr_new == 1;
gen alt_incfarm = incfarm if incfarm < 99999999 & classwkr_new == 1;
gen alt_incss = incss if incss < 999999 & classwkr_new == 1;
gen inc_selfemp = (alt_incwage + alt_incbus + alt_incfarm + alt_incss) if classwkr_new == 1;
replace earnings = inc_selfemp if classwkr_new == 1;
*weight;
replace new_wt = round(asecwt) if classwkr_new == 1;
******bring it all together;
gen tru_numerator = 0;
replace tru_numerator = 1 if fulltime == 1 & earnings >= 25000;
tab tru_numerator if u3denominator == 1 [fw=new_wt];
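The final tabulation amounts to a weighted share: among labor-force records, the weighted fraction that is not both full-time (or part-time by choice) and earning at least $25,000. A minimal Python sketch of that arithmetic follows. The field names and toy records are hypothetical, not CPS data. Treating missing earnings as failing the threshold is an assumption of this sketch; it is worth making the choice explicit because, in Stata, a condition such as `earnings >= 25000` evaluates to true when `earnings` is missing.

```python
def functional_unemployment_rate(records, threshold=25000.0):
    """Weighted share of labor-force records that fail the full-time-and-earnings test."""
    labor_force_weight = 0.0
    adequate_weight = 0.0
    for r in records:
        if not r["in_labor_force"]:
            continue  # same role as restricting to u3denominator == 1
        labor_force_weight += r["weight"]
        earnings = r["earnings"]
        # Assumption of this sketch: unknown earnings do not meet the threshold.
        meets_earnings = earnings is not None and earnings >= threshold
        if r["full_time_or_pt_by_choice"] and meets_earnings:
            adequate_weight += r["weight"]
    if labor_force_weight == 0:
        raise ValueError("no labor-force records to tabulate")
    return 1.0 - adequate_weight / labor_force_weight


# Toy records with made-up values, for illustration only.
toy_records = [
    {"in_labor_force": True, "full_time_or_pt_by_choice": True,
     "earnings": 40000.0, "weight": 100.0},
    {"in_labor_force": True, "full_time_or_pt_by_choice": True,
     "earnings": 20000.0, "weight": 100.0},
    {"in_labor_force": True, "full_time_or_pt_by_choice": False,
     "earnings": None, "weight": 200.0},
    {"in_labor_force": False, "full_time_or_pt_by_choice": False,
     "earnings": None, "weight": 100.0},
]

print(functional_unemployment_rate(toy_records))  # 0.75
```

In the toy data only the first record passes both tests, so 100 of the 400 weighted labor-force units count as adequately employed and the rate is 0.75.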