Converting occupation codes across years

Is there a document (or better yet, a script) that contains a mapping between all the occupation classifications across all the changes that the Census Bureau has implemented? That is, something like a table that maps every occupation code to its equivalent in the classification for each year? As far as I know, the mappings that exist only cover two consecutive classifications; none is “universal”.

IPUMS provides harmonized versions of occupation (and industry) that do just this: they recode all the different iterations of the Census occupation coding scheme to a single version of it. We generate these by compiling the crosswalks you describe, which track changes across two adjacent schemes. See the variables OCC1990 and OCC2010 in IPUMS CPS. The CPS versions have not yet been updated to cover the 2018 Census occupation codes (introduced to CPS in 2020); however, their counterparts in IPUMS USA (OCC1990, OCC2010) have been updated. The creation of the OCC1990 variable through the 2000 coding scheme is described in a BLS working paper; another working paper describes the methodology for adding 2010 (which was also used to update the variables to include data coded using the 2018 scheme). Contact ipums@umn.edu directly if you would like the underlying metadata we use to generate these harmonized measures.
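To make the idea of compiling adjacent crosswalks concrete, here is a minimal Python sketch of chaining them into a single mapping. It is not the IPUMS procedure: the codes and the three crosswalk tables are invented for illustration, and the `compose` helper is hypothetical. It also assumes each code maps to exactly one target, whereas real crosswalks may split a code across several targets.

```python
# Hypothetical adjacent-scheme crosswalks. The codes are invented placeholders,
# NOT real Census occupation codes.
XWALK_2018_TO_2010 = {"A18": "A10", "B18": "B10", "C18": "B10", "D18": "D10"}
XWALK_2010_TO_2000 = {"A10": "A00", "B10": "B00"}  # "D10" deliberately missing
XWALK_2000_TO_1990 = {"A00": "A90", "B00": "A90"}


def compose(*crosswalks):
    """Chain adjacent crosswalks (newest scheme first) into one mapping.

    A source code maps to None if any step in the chain has no entry for it.
    """
    first, *rest = crosswalks
    combined = {}
    for source, code in first.items():
        for step in rest:
            code = step.get(code)
            if code is None:
                break  # the chain is broken; leave this code unmapped
        combined[source] = code
    return combined


to_1990 = compose(XWALK_2018_TO_2010, XWALK_2010_TO_2000, XWALK_2000_TO_1990)
unmapped = sorted(code for code, target in to_1990.items() if target is None)
```

Listing the unmapped codes, as in the last line, is a simple way to spot gaps before relying on a composed mapping.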


I was a little confused by the guidance in the codebook entry for OCC1950.

“User Caution: The translation of occupation codes into the 1950 classification is particularly problematic for 1980 onward. The Census Bureau significantly reorganized the occupational classification scheme in 1980 and again in 2000. Comparisons across the post-1980 period and with earlier years will be more distorted than similar comparisons across other decades. Researchers focusing on these samples should consider using OCC1990.”

I wish to track changes in occupational composition from 1950 through 2019. Is your recommendation to use OCC1990 for the decades in the current analysis? Does this resolve the distortion described in the codebook?

On a related note: for some occupation codes of interest to this analysis, OCC1990 offers the clearest definition, and for others it’s OCC2010. In these cases, would OCC2010 be a reasonable alternative that offers an undistorted comparison with earlier decades?

You may find this useful:
https://www.ddorn.net/data.htm

Thanks, @molivo. I’ll take a look. @KariWilliams, can you offer any advice on my questions above?

@KariWilliams is currently on leave, but I’ll try to answer your question. IPUMS does not have a specific recommendation for which occupational coding system to use. The variable description page for OCC1990 states:

We chose the 1990 scheme as the standard for OCC1990 so that no year’s occupational data would be forced to bridge both of the two most significant changes in census-based coding schemes: from 1970 to 1980 and from 1990 to 2000. In OCC1990, all samples from 1968 onwards bridge no more than one of these major shifts. For this reason, the variable may be preferable to OCC1950 for the samples from 1980 onward. Sensitivity testing suggests that OCC1990 performs very similarly to OCC1950 for most purposes.

For this reason, OCC1990 may be somewhat less distorted than either OCC2010 or OCC1950 for your years of interest, so it may be your best choice. But I wouldn’t say that it resolves the distortion.

@Matthew_Bombyk covered this nicely! The only thing I would add is that you might select the occupation scheme to use based on your specific research application and the occupations/level of detail you require. For example, recent schemes include much more detail about computing and healthcare occupations. Because this detail isn’t available in earlier years, you will see that some of the more recent codes have zero cases in earlier samples; however, OCC2010 would provide detail for these occupations. Alternatively, if you are less interested in specific occupations or plan to aggregate more detailed occupations into larger groups anyway, choosing something closer to the midpoint of your reference period may be a nice balance.
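As a rough illustration of both points (empty codes in early samples, and aggregating detailed codes into larger groups), here is a small Python sketch. The records, the code labels, and the `GROUPS` mapping are all made up, and the counts are unweighted; a real analysis would use actual harmonized codes and the appropriate person weights.

```python
from collections import Counter

# Hypothetical person records: (survey year, harmonized occupation code).
# The labels stand in for numeric codes and are purely illustrative.
records = [
    (1950, "clerk"), (1950, "clerk"), (1950, "nurse"),
    (2019, "clerk"), (2019, "nurse"), (2019, "software_dev"),
]

# Hypothetical grouping of detailed codes into broader categories.
GROUPS = {"clerk": "office", "nurse": "health", "software_dev": "computing"}

counts = Counter(records)
years = sorted({year for year, _ in records})
codes = sorted({code for _, code in records})

# Year/code cells with zero cases: these codes cannot be tracked over the full period.
empty_cells = [
    (year, code) for year in years for code in codes if counts[(year, code)] == 0
]


def group_shares(year):
    """Return each broad group's unweighted share of records in the given year."""
    totals = Counter()
    for (record_year, code), n in counts.items():
        if record_year == year:
            totals[GROUPS[code]] += n
    n_year = sum(totals.values())
    return {group: n / n_year for group, n in totals.items()}
```

Checking `empty_cells` first shows which detailed codes are unusable for the early years; `group_shares` then gives a comparison that does not depend on that detail.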