Hi. I’ve recently created data extracts for each month since 2000 with the variables STATECENSUS, IND, HOURWAGE, and EARNWT from the monthly CPS data. I’ve noticed that there is significantly less data for construction (IND code 770) and for STATECENSUS code 21. Why is this the case?
As a side question, I have been exporting my data as csv files, which is the format our software requires. However, the files are so big that I can only pull 6 months at a time. It does not appear that I can use the select cases feature. Is there another way I could pull the data so that the process is faster in the future?
Over the years, the Census Bureau has changed the way it categorizes occupations and industries. CPS samples from 1992-2002 use the 1990 census industry classification scheme, while samples from 2003-2008 use the 2002 scheme. You can find links to the different schemes on the IND codes tab. The code 770 refers to lodging places except hotels and motels in the former scheme and to construction in the latter. If you want to analyze construction workers across these schemes, you will need to use IND code 60 when analyzing pre-2003 samples. Alternatively, you might consider using the IPUMS harmonized variable IND1950, which consistently applies the 1950 industry coding scheme across all CPS samples. Specifically, IND1950 maps both IND = 60 (samples from 1988-2002) and IND = 770 (samples from 2003 onward) to IND1950 = 246.
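If you would rather keep IND and handle the code change yourself, the logic might look something like the Python sketch below. It assumes the switch falls between the 2002 and 2003 samples, as described above, and that each record carries YEAR and IND. The function names and the three sample rows are made up for illustration.

```python
def construction_code_for_year(year):
    """Return the IND code that denotes construction for a given sample year."""
    # Assumption: 1990 scheme (code 60) through 2002, 2002 scheme (code 770) from 2003.
    return 60 if year <= 2002 else 770


def is_construction(year, ind):
    """True when an IND value means construction under the scheme used that year."""
    return ind == construction_code_for_year(year)


# Made-up rows for illustration only.
rows = [
    {"YEAR": 2001, "IND": 60},   # construction under the 1990 scheme
    {"YEAR": 2001, "IND": 770},  # lodging places under the 1990 scheme
    {"YEAR": 2005, "IND": 770},  # construction under the 2002 scheme
]
construction_rows = [r for r in rows if is_construction(r["YEAR"], r["IND"])]
```

Filtering on IND1950 = 246 instead would avoid the year check entirely.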
The select cases feature only works with certain variables, and those variables need to be added to the data cart first. If you would like to split your extracts by state, you can add the variable STATEFIP to your data cart and then go to the select cases option. If you are only interested in respondents with data on wages, another option is to add the variable ELIGORG and select only respondents who are eligible for the Earner Study questions. Note that HOURWAGE is not available after April 2023. Researchers who want to compare hourly wages before and after this period should use HOURWAGE2 instead.
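If the csv files are still too large after using select cases, one workaround on your side is to stream each file row by row and write out only the records you need, rather than loading the whole file at once. Here is a rough sketch using Python's standard csv module. It assumes the extract has a header row with the variable names as column names; the helper names and the sample values are invented for illustration.

```python
import csv
import io


def filter_extract(src, dst, keep):
    """Copy rows from src to dst, keeping only those where keep(row) is true.

    Rows are read one at a time, so memory use stays small for large files.
    Returns the number of rows written.
    """
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
    writer.writeheader()
    kept = 0
    for row in reader:
        if keep(row):
            writer.writerow(row)
            kept += 1
    return kept


def wanted(row):
    # csv values arrive as strings, so compare against "21" rather than 21.
    return row["STATECENSUS"] == "21"


# Made-up sample data standing in for a real extract.
demo = io.StringIO(
    "YEAR,STATECENSUS,IND,HOURWAGE,EARNWT\n"
    "2005,21,770,18.50,1500.0\n"
    "2005,93,770,20.00,1400.0\n"
    "2005,21,60,15.00,1300.0\n"
)
out = io.StringIO()
kept = filter_extract(demo, out, wanted)
```

With real files you would pass handles opened with open(path, newline="") in place of the io.StringIO objects.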