The CPS documentation for OCC lists the occupation coding scheme that applies to each year. However, several of these assignments appear to be wrong.
For example, the 1971 data use the 1970 Census codes, even though the documentation says they use the 1960 Census codes. In 1969 and 1970, most of the dataset uses 1960 Census codes, but a few respondents have 1970 Census codes. In 1977, one entry has an OCC code that doesn’t exist in the 1970 Census codes but does exist in the 1980 Census codes.
So far, I’ve been going through and correcting these manually for my analysis. But this is still really dangerous, because I can only spot errors where a code is invalid under the documented scheme. I am concerned that there may be even more “silent errors,” where the same code has entirely different meanings in the 1960 and 1970 Census codes, so that interpreting it under the wrong scheme leads to incorrect results.
Could someone please go through more systematically and correct these errors?
IPUMS generally does not correct errors that are present in original (source) data, such as the original CPS data. We document systematic errors for data users when we become aware of them, but seek to preserve the original data in most circumstances. In many cases, we would not be able to correct errors in the original data—in the cases you describe, for example, IPUMS staff would have no way of knowing what a stray occupation code would represent for an individual in the CPS.
I can confirm that the original CPS documentation states that the 1971 ASEC uses the 1960 census occupation coding scheme (see page 51 of the 1971 ASEC codebook). However, you are correct that the data appear to better match the 1970 occupation coding scheme. While IPUMS will not change occupation codes for these data, we will update our documentation to reflect that the most appropriate set of codes to use for these data is likely the 1970 occupation codes. I should note, though, that I am not able to find any information from the Census Bureau or BLS to confirm that the 1971 ASEC uses the 1970 occupation coding scheme. Discrepancies between official CPS documentation and the data themselves are fairly common in many of these older CPS samples.
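For anyone who wants to check which coding scheme a given year's data "better match," one rough approach is to compute the share of observations whose OCC value is valid under each candidate scheme. The following is only a minimal sketch: the function name `scheme_match_rates`, the `SCHEMES` dictionary, and the code values in it are hypothetical placeholders, not the real Census occupation lists, which would need to be entered from the codebooks.

```python
from collections import Counter

# Placeholder code lists for illustration only. These are NOT the real
# Census occupation classifications; load the full lists from the codebooks.
SCHEMES = {
    "1960": {10, 12, 20, 301},
    "1970": {1, 2, 10, 301, 905},
}


def scheme_match_rates(occ_codes, schemes):
    """For each scheme, return the share of observations with a valid code."""
    counts = Counter(occ_codes)
    total = sum(counts.values())
    if total == 0:
        return {name: 0.0 for name in schemes}
    return {
        name: sum(n for code, n in counts.items() if code in valid) / total
        for name, valid in schemes.items()
    }


# Hypothetical OCC values for the respondents of a single survey year.
sample = [1, 2, 2, 905, 10, 301, 12]
rates = scheme_match_rates(sample, SCHEMES)
best = max(rates, key=rates.get)  # scheme with the highest match rate
```

A high match rate is only suggestive. Codes that are valid in more than one scheme count as matches for each of them, so this check cannot tell which meaning was intended for those codes.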
In cases where there are a small number of codes in the original CPS data that do not match up with the occupation coding scheme that was used in that year, IPUMS is not able to do anything to “correct” these codes or determine which occupations they correspond to. I understand this can be frustrating for researchers as errors like these can introduce some uncertainty. If you do see evidence of a systematic error or one that affects a large number of cases, please inform us here in the forum or by emailing ipums@umn.edu.
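Researchers who want to screen their own extracts for such stray codes could flag every record whose code is absent from the scheme assumed for its year, and then decide how to handle those cases. This is a sketch under stated assumptions: the record keys `year` and `occ`, the function `flag_stray_codes`, the year-to-scheme mapping, and the code lists are all hypothetical placeholders rather than actual IPUMS or Census values.

```python
# Placeholder year-to-scheme mapping and code lists, for illustration only.
SCHEME_FOR_YEAR = {1970: "1960", 1977: "1970"}
VALID_CODES = {
    "1960": {10, 12, 20},
    "1970": {1, 2, 905},
}


def flag_stray_codes(records, scheme_for_year, valid_codes):
    """Return records whose code is not in the scheme assumed for their year."""
    flagged = []
    for rec in records:
        scheme = scheme_for_year.get(rec["year"])
        # A year with no assumed scheme is flagged as well, for manual review.
        if scheme is None or rec["occ"] not in valid_codes[scheme]:
            flagged.append({**rec, "assumed_scheme": scheme})
    return flagged


# Hypothetical records standing in for rows of a CPS extract.
records = [
    {"year": 1970, "occ": 12},
    {"year": 1970, "occ": 905},
    {"year": 1977, "occ": 1},
    {"year": 1977, "occ": 4},
]
stray = flag_stray_codes(records, SCHEME_FOR_YEAR, VALID_CODES)
```

This only catches codes that are invalid under the assumed scheme. A code that exists in both schemes with different meanings would pass this check unnoticed.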