I’m trying to perform an analysis where I take microdata-level OCCSOC responses from the ACS and pair them with the typical education needed for entry (from the BLS standards). I’m coming up with a lot of N/As, either because the OCCSOC response is coded with a 0 at the end (11-2020, for example) and doesn’t appear in the education file, or because it is coded with a number of X’s at the end (e.g., 11-10XX).

I’m wondering what the reasons for these atypical SOC codes are, and if there’s anything I can do about them?

The ACS does not originally code Occupation responses into SOC categories, but rather codes them into a specific Census Occupation coding structure. While this Census Occupation coding structure is based on the SOC framework, the two coding structures do not perfectly map onto one another. Codes including XX (and sometimes YY) represent combinations of SOC categories that translate to a single Census Occupation code. IPUMS-USA includes a table that lists which categories are included in these combined codes here.

The codes that end in 0, such as 11-2020, appear to also represent combinations (11-2020=Marketing and sales managers, while 11-2021=Marketing managers and 11-2022=Sales managers according to the BLS page you linked to). The 2012 ACS Occupation code list states “101 aggregated categories were combined for confidentiality reasons, including because the category contained fewer than 10,000 people nationwide.”
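
As for what you can do about them, one option is to expand each aggregated code into its detailed SOC components and assign an education level only when all components agree. The Python sketch below is a rough illustration, not an official crosswalk. The function names, the `EDUCATION` values, and the `CROSSWALK` entry are placeholders I made up for the example; the real values should come from the BLS education file and the IPUMS table. It also assumes that a code ending in 0 covers every detailed code sharing its first six characters, which you should check against the ACS code list.

```python
# Placeholder data: load the real values from the BLS education file.
EDUCATION = {
    "11-1011": "Bachelor's degree",
    "11-1031": "Bachelor's degree",
    "11-2021": "Bachelor's degree",
    "11-2022": "Bachelor's degree",
}

# Placeholder entry: XX/YY codes do not follow a simple digit pattern,
# so their components must be copied from the IPUMS table.
CROSSWALK = {
    "11-10XX": ["11-1011", "11-1031"],
}


def component_codes(occsoc, education, crosswalk):
    """Return the detailed SOC codes that one OCCSOC value may cover."""
    code = occsoc.strip().upper()
    if code in education:
        return [code]  # already a detailed code
    if code in crosswalk:
        return list(crosswalk[code])  # explicit XX/YY combination
    if code.endswith("0"):
        # Assumption: a trailing 0 groups detailed codes with the same prefix.
        prefix = code[:-1]
        return sorted(c for c in education if c.startswith(prefix))
    return []


def education_for(occsoc, education, crosswalk):
    """Return one education level, or None if missing or ambiguous."""
    components = component_codes(occsoc, education, crosswalk)
    levels = {education[c] for c in components if c in education}
    if len(levels) == 1:
        return levels.pop()
    return None


print(component_codes("11-2020", EDUCATION, CROSSWALK))
print(education_for("11-10XX", EDUCATION, CROSSWALK))
```

Codes whose components disagree on education level come back as `None`, so you can decide separately whether to drop them, use the most common level, or weight the components by employment.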

