Drawing conclusions about industry & occupational employment from CPS data

I’d like to understand the evolution of managerial and administrative occupations in different sectors of the US economy, and I’m currently using CPS monthly data for that (1983-2023), focusing on OCC2010 and IND1990. Wondering whether users who might have pursued a similar line of inquiry have some caveats / watch outs when it comes to the robustness of the employment data at this level of granularity. Many thanks!

I saw you have a separate post with a more specific question, which I will answer separately. However, I want to provide a general overview here.

Aggregating into the broad groupings of codes (see those listed on the OCC2010 variable description) should generally be fine. For narrower subsets, it may be harder to make comparisons across time, because this harmonization loses some detail with each transition across the original coding schemes. There are substantial changes between some of the schemes (e.g., for occupations, there are quite large changes between 1970 and 1980, and again between 1990 and 2000/2002). For your time period, it may be worth comparing results using OCC2010 to those using OCC1990.
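As a rough illustration of that comparison, here is a minimal Python sketch that computes the weighted share of managerial employment by year under each harmonized variable and then takes the difference. The column names (YEAR, WTFINL), the code ranges in MANAGER_RANGES, and the toy records are assumptions for illustration only; please check the ranges against the OCC2010 and OCC1990 variable descriptions before relying on them.

```python
from collections import defaultdict

# Assumed code ranges for managerial occupations under each harmonized scheme.
# These are illustrative; verify them against the IPUMS variable descriptions.
MANAGER_RANGES = {
    "OCC2010": (10, 430),
    "OCC1990": (3, 22),
}


def manager_share_by_year(rows, occ_var, weight_var="WTFINL"):
    """Weighted share of records whose occ_var falls in the managerial range, by year."""
    lo, hi = MANAGER_RANGES[occ_var]
    total = defaultdict(float)
    managers = defaultdict(float)
    for row in rows:
        weight = row[weight_var]
        total[row["YEAR"]] += weight
        if lo <= row[occ_var] <= hi:
            managers[row["YEAR"]] += weight
    return {year: managers[year] / total[year] for year in total}


# Toy records with made-up codes and weights (not real CPS data).
rows = [
    {"YEAR": 1999, "OCC2010": 20, "OCC1990": 22, "WTFINL": 100.0},
    {"YEAR": 1999, "OCC2010": 4700, "OCC1990": 243, "WTFINL": 300.0},
    {"YEAR": 2003, "OCC2010": 430, "OCC1990": 22, "WTFINL": 100.0},
    {"YEAR": 2003, "OCC2010": 430, "OCC1990": 37, "WTFINL": 50.0},
    {"YEAR": 2003, "OCC2010": 4700, "OCC1990": 243, "WTFINL": 250.0},
]

share_2010 = manager_share_by_year(rows, "OCC2010")
share_1990 = manager_share_by_year(rows, "OCC1990")

# Years where the two harmonizations disagree deserve a closer look.
gap = {year: share_2010[year] - share_1990[year] for year in share_2010}
```

In this toy data the two series agree in 1999 and diverge in 2003, because one record counts as managerial under OCC2010 but not under OCC1990. A gap that opens up around a change in the underlying Census coding scheme would suggest the trend is sensitive to the harmonization rather than to real employment changes.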

The Census Bureau classifies occupation and industry responses into year-specific coding schemes that are updated approximately every 10 years; these unharmonized, year-specific values are available in the variables OCC and IND. Harmonized IPUMS occupation and industry variables reconcile these census coding schemes by cross-walking year-specific codes into a single common scheme (e.g., the 2010 scheme for OCC2010 and the 1990 scheme for IND1990). This is done by using crosswalks that show how codes in a given year map onto the adjacent scheme (typically created with double-coded samples that also show the percentage of a previous code that is associated with each new code). Some codes may be contested, meaning they map onto more than one code in the adjacent scheme (a 1:many relationship). IPUMS handles these contested codes using a modal approach: all cases are assigned to the single most common matching code. This can cause individual occupation codes to disappear or appear at transitions between schemes. Below is an example from the IPUMS CPS OCC1990 variable description:

… persons coded as “Gaming managers” in 2000 (2000 code 33), the Census Bureau determined that 35% would have been coded as “Managers, service organizations” in 1990 (1990 code 21), while 65% would have been coded as “Managers, food serving and lodging establishments” (1990 code 17). Thus, OCC1990 assigns a code of 17 to the cases in the 2000 IPUMS sample having an original 2000 OCC value of 33.

Considering Gaming managers alone, the OCC1990 variable will overstate the number of “Managers, food serving and lodging establishments” and understate the number of “Managers, service organizations” because of this modal assignment approach. Additionally, it is worth noting that IPUMS harmonized occupation and industry codes do not take demographics into account when crosswalking codes, even though demographics may be relevant for certain occupations that are disaggregated (e.g., using sex, race, age, or foreign-born status could allow for a more nuanced assignment than the modal approach that we use).
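To make the size of that distortion concrete, here is a small Python sketch that contrasts modal assignment with a proportional split, using the 65%/35% shares from the Gaming managers example above. The function names and the count of 1,000 workers are hypothetical, and the proportional split is shown only as a point of comparison; it is not how IPUMS builds OCC1990.

```python
# Crosswalk shares from the quoted example:
# 2000 code 33 -> 1990 code 17 (65%) and 1990 code 21 (35%).
crosswalk = {33: {17: 0.65, 21: 0.35}}


def modal_assign(counts, crosswalk):
    """Send every case of a source code to its single most common target code."""
    out = {}
    for src, n in counts.items():
        target = max(crosswalk[src], key=crosswalk[src].get)
        out[target] = out.get(target, 0) + n
    return out


def proportional_assign(counts, crosswalk):
    """Split each source code across target codes according to the crosswalk shares."""
    out = {}
    for src, n in counts.items():
        for target, share in crosswalk[src].items():
            out[target] = out.get(target, 0) + n * share
    return out


# Hypothetical weighted count of Gaming managers (2000 code 33).
counts_2000 = {33: 1000}

modal = modal_assign(counts_2000, crosswalk)
proportional = proportional_assign(counts_2000, crosswalk)
```

With these inputs, the modal approach puts all 1,000 workers in 1990 code 17 and none in code 21, whereas the proportional split implies roughly 650 and 350. The difference of about 350 workers is the over- and understatement described above.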
