I am trying to use the CPS microdata to estimate, for each state and over time, approximately how many workers earn an hourly wage that is (roughly) equal to the minimum wage of the state they work in. I have already scraped the US DOL site on state minimum wage laws for the wage part of this.
From my understanding of the CPS microdata, I will need to use the HOURWAGE variable and apply the EARNWT variable in order to get the number of people in a specific state’s population (STATEFIP being another variable I will pull) earning said wage for said year.
According to this very helpful reply to a question on applying statistical weights to calculate weighted averages in the IPUMS forum, the sum of the PERWT variable in a specific year & state is the total weighted population. If the EARNWT variable is used in the same way as the PERWT variable, I could in theory assume the same and divide the individual EARNWT variable by the sum of the EARNWT variable for each state & year in order to get a proportion of that state’s population (unsure if this is a variable I should get from the Census via another avenue or also pull from the CPS to be consistent) that made HOURWAGE hourly.
Lastly, I noticed two things with the HOURWAGE variable. The first is that the code “999.99” = N.I.U. (Not in Universe). I’m assuming I should treat rows with this value as “NULL” and remove them. The second is that HOURWAGE is topcoded. I read more on this page about topcoding but to be honest didn’t really understand the function of it or how it would be applied in this analysis.
If it isn’t glaringly obvious, this is my first time working with CPS data in IPUMS, and I don’t really have much experience with basic analyses of this nature using survey data like the CPS, so I apologize for how rudimentary I’m sure these questions come across. I am more of a learn-by-doing type of person but have noticed both the FAQ and the tutorials/training materials on the IPUMS CPS site. However, I would also be grateful for any other tips/pointers/learning & reference materials for beginners.
In order to get the proportion of workers earning below the minimum wage, you’d want to sum EARNWT for all workers within the state earning less than the minimum wage, and then divide that by the total sum of EARNWT within the state.
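As a rough illustration of that calculation, here is a minimal sketch in plain Python (the same logic should map onto a grouped sum in pandas, R, or Stata). The rows, weights, and the `min_wage` lookup are made-up values for the example, and the function name `share_below_min_wage` is hypothetical; only the variable names YEAR, STATEFIP, HOURWAGE, and EARNWT come from the extract discussed above.

```python
from collections import defaultdict

# Illustrative rows, one per respondent; all values are made up for the example.
rows = [
    {"YEAR": 2019, "STATEFIP": 27, "HOURWAGE": 9.50, "EARNWT": 1500.0},
    {"YEAR": 2019, "STATEFIP": 27, "HOURWAGE": 15.00, "EARNWT": 2000.0},
    {"YEAR": 2019, "STATEFIP": 27, "HOURWAGE": 999.99, "EARNWT": 500.0},  # NIU
    {"YEAR": 2019, "STATEFIP": 48, "HOURWAGE": 7.00, "EARNWT": 1000.0},
    {"YEAR": 2019, "STATEFIP": 48, "HOURWAGE": 7.25, "EARNWT": 3000.0},
]

# Hypothetical lookup built from the scraped DOL table, keyed by (YEAR, STATEFIP).
min_wage = {(2019, 27): 9.86, (2019, 48): 7.25}


def share_below_min_wage(rows, min_wage):
    """Weighted share of people earning less than the minimum wage, per (YEAR, STATEFIP)."""
    below = defaultdict(float)  # sum of EARNWT for rows under the minimum wage
    total = defaultdict(float)  # sum of EARNWT for every row in the state and year
    for r in rows:
        key = (r["YEAR"], r["STATEFIP"])
        total[key] += r["EARNWT"]
        # The NIU code 999.99 is never below a minimum wage, so NIU rows
        # add to the denominator only.
        if r["HOURWAGE"] < min_wage[key]:
            below[key] += r["EARNWT"]
    return {key: below[key] / total[key] for key in total if total[key] > 0}


shares = share_below_min_wage(rows, min_wage)
print(shares)
```

Swapping `<` for `<=` would give the share earning at or below the minimum wage instead.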
Regarding NIU values in HOURWAGE, you can treat them as null observations, but I wouldn’t remove them, because this will also remove anyone who is not paid hourly, which is an important part of the denominator in your calculation. You can find who is included in the universe for this variable each year by consulting the universe page for HOURWAGE. Please note that this excludes anyone who is not paid by the hour (according to PAIDHOUR). Although this will capture most minimum wage workers, to get a more complete measure you may want to calculate the wage for those not paid hourly, by dividing EARNWEEK by UHRSWORK1.
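One way to sketch that fallback in plain Python is shown below. The helper name `effective_hourly_wage` is hypothetical. The 999.99 code for HOURWAGE is the one mentioned in this thread; the special codes used here for EARNWEEK (9999.99) and UHRSWORK1 (997 and 999) are assumptions that should be checked against the codebook for your extract.

```python
HOURWAGE_NIU = 999.99            # NIU code for HOURWAGE, as noted in this thread
EARNWEEK_NIU = 9999.99           # assumption: NIU code for EARNWEEK, verify in the codebook
UHRSWORK1_SPECIAL = {997, 999}   # assumption: "hours vary" and NIU codes, verify in the codebook


def effective_hourly_wage(row):
    """Return an hourly wage for the row, or None when one cannot be derived.

    Uses HOURWAGE when it is in universe; otherwise falls back to
    EARNWEEK / UHRSWORK1 for people not paid by the hour.
    """
    if row["HOURWAGE"] < HOURWAGE_NIU:
        return row["HOURWAGE"]
    hours = row["UHRSWORK1"]
    if row["EARNWEEK"] >= EARNWEEK_NIU or hours in UHRSWORK1_SPECIAL or hours <= 0:
        return None  # keep the row for the denominator, but it has no usable wage
    return row["EARNWEEK"] / hours


# Illustrative rows with made-up values.
hourly = {"HOURWAGE": 12.0, "EARNWEEK": 480.0, "UHRSWORK1": 40}
salaried = {"HOURWAGE": 999.99, "EARNWEEK": 800.0, "UHRSWORK1": 40}
no_earnings = {"HOURWAGE": 999.99, "EARNWEEK": 9999.99, "UHRSWORK1": 999}

print(effective_hourly_wage(hourly))       # reported hourly wage
print(effective_hourly_wage(salaried))     # derived from weekly earnings and usual hours
print(effective_hourly_wage(no_earnings))  # None
```

Rows that return `None` would stay in the data so that their EARNWT still counts toward the state total; they simply never satisfy the minimum wage condition.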
Topcodes obscure detail in responses for those at the upper end of a distribution (e.g., high wage earners in the case of HOURWAGE) to protect their anonymity; values exceeding the topcode threshold are replaced with the topcode. Because you are focused on minimum wage workers, the topcodes aren’t relevant to this specific analysis.
Thank you so much for getting back to me, @Matthew_Bombyk, and apologies for the delayed response.
Good to know that if I want to get the proportion of workers earning at or below the minimum wage, I would sum EARNWT for workers in a state making minimum wage or below and divide by the total sum of EARNWT within the state.
What I am looking for is slightly different but related: an estimate of the raw number of people in each state who are making that state’s minimum wage (or below, which I am assuming would be very few since the FLSA covers most workers) for each year. In this case the population I am interested in is just the people making minimum wage (or below) for their state. Is it possible to use EARNWT (or another CPS variable via IPUMS) to get that total?
Duly noted on the NIU values for HOURWAGE and the universe of people included for HOURWAGE. I will also use EARNWEEK / UHRSWORK1 to try to capture workers making at or below minimum wage who are not paid hourly (again, I’m assuming this wouldn’t be that large a number).
Also, thanks so much for the explainer on topcodes, I kind of assumed it was an anonymizing thing but didn’t know for certain.
Hi @Adrian_Nesta, glad to help! To estimate the raw number of individuals earning the minimum wage, just sum up the EARNWT for all the people who meet that condition. It’s the same as the proportion, except you don’t divide by the sum of EARNWT in the state.
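A minimal sketch of that sum in plain Python follows. The rows and the `min_wage` lookup are made-up values, and the function name `count_at_or_below` is hypothetical. For brevity it uses HOURWAGE only; a derived wage for people not paid hourly could be substituted for it.

```python
from collections import defaultdict

# Illustrative rows, one per respondent; all values are made up for the example.
rows = [
    {"YEAR": 2019, "STATEFIP": 27, "HOURWAGE": 9.50, "EARNWT": 1500.0},
    {"YEAR": 2019, "STATEFIP": 27, "HOURWAGE": 15.00, "EARNWT": 2000.0},
    {"YEAR": 2019, "STATEFIP": 48, "HOURWAGE": 7.00, "EARNWT": 1000.0},
    {"YEAR": 2019, "STATEFIP": 48, "HOURWAGE": 7.25, "EARNWT": 3000.0},
    {"YEAR": 2019, "STATEFIP": 48, "HOURWAGE": 12.00, "EARNWT": 2500.0},
    {"YEAR": 2019, "STATEFIP": 48, "HOURWAGE": 999.99, "EARNWT": 400.0},  # NIU
]

# Hypothetical lookup built from the scraped DOL table, keyed by (YEAR, STATEFIP).
min_wage = {(2019, 27): 9.86, (2019, 48): 7.25}


def count_at_or_below(rows, min_wage):
    """Estimated number of people at or below the minimum wage, per (YEAR, STATEFIP)."""
    counts = defaultdict(float)
    for r in rows:
        key = (r["YEAR"], r["STATEFIP"])
        # The NIU code 999.99 never satisfies this condition, so NIU rows are skipped.
        if r["HOURWAGE"] <= min_wage[key]:
            counts[key] += r["EARNWT"]  # no division: the weighted sum is the estimate
    return dict(counts)


counts = count_at_or_below(rows, min_wage)
print(counts)
```

It may also be worth checking the EARNWT documentation on pooling monthly samples: if a year's rows combine several months, the summed weights would presumably need to be averaged over those months rather than used as is.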