Not Yet Classified Occupation Codes in 1900-1930 full count Census

Over the past few years, I have been working with the 1900, 1910, 1920 and 1930 full count Census products from IPUMS and have noticed that the number of observations with occupations classified as “Not Yet Classified” has been gradually declining across releases (which is great!). Is it possible to get a sense of the methodology being used to classify or assign difficult-to-code occupations, or of the reasons why an occupational response might be listed as “Not Yet Classified” in the first place? In particular, is the Multigenerational Longitudinal Panel data being used at all to fill in these gaps (using occupational data for the same person from other census years), or is the assignment of occupations coming only from contemporary information? Thanks so much!

Most often, an individual’s occupation is labeled “Not yet classified” because the occupation string is unclear, illegible, or too low-frequency for us to have had the resources to code it at the time. OCC codes can later be reassigned in the historical data based on what can be gleaned from the occupation string, along with other variables such as industry, class of worker, and occasionally sex and relationship to the household head. In other cases, the occupation string is legible and common enough, but it is still labeled “Not yet classified” because the occupation seems out of place given the other available information; here too, we review those other variables to help assign the occupation code. We do not use data from other decennial censuses via the IPUMS Multigenerational Longitudinal Panel in this process. With revised versions of the historical full count files, the IPUMS historical team has introduced new coding rules to reduce the number of not yet classified occupations. You can find notes and guidelines for coding occupations from each sample on this page.
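
If you want to track how the share of “Not yet classified” cases changes between versions of your own extract, a small tabulation along these lines may help. This is only a sketch under assumptions: it assumes the extract has been read into a list of records with `YEAR` and `OCC1950` fields, and that 979 is the OCC1950 code labeled “Not yet classified” (please confirm against the codes page for your extract). The function name and the sample rows are made up for illustration and are not real census data.

```python
from collections import Counter

# Assumption: 979 is the OCC1950 code for "Not yet classified".
# Verify this against the codes listing for your own extract.
NOT_YET_CLASSIFIED = 979


def share_not_classified(records):
    """Return {census year: share of records coded as not yet classified}."""
    totals = Counter()
    flagged = Counter()
    for rec in records:
        year = rec["YEAR"]
        totals[year] += 1
        if rec["OCC1950"] == NOT_YET_CLASSIFIED:
            flagged[year] += 1
    return {year: flagged[year] / totals[year] for year in sorted(totals)}


# Made-up illustrative rows, not real census data.
sample = [
    {"YEAR": 1900, "OCC1950": 979},
    {"YEAR": 1900, "OCC1950": 100},
    {"YEAR": 1910, "OCC1950": 979},
    {"YEAR": 1910, "OCC1950": 970},
    {"YEAR": 1910, "OCC1950": 100},
    {"YEAR": 1910, "OCC1950": 590},
]

print(share_not_classified(sample))
```

Running the same tabulation on an older and a newer download of the same sample would show how much the revised coding rules reduced the unclassified share in each year.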