The notes for the preliminary full-count data say, “Coded variables derived from string variables are still in progress. These variables include: occupation and industry.”
Does this mean we should avoid fields such as OCCSCORE in the preliminary full-count data for statistical purposes? Can more be said about the patterns of missingness or scope for miscoding in these variables?
OCCSCORE is derived from OCC1950, which is available for the preliminary full-count 1910 sample, so there is no reason to avoid using it in your analysis. Other occupation- and industry-related variables are not yet available for the sample.
As noted here, we have allocated missing observations and edited some inconsistencies for several variables in the sample, including OCC1950. If you would like to know the extent to which observations have been allocated or edited, you can add the flag variable QOCC to your extract.
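Once QOCC is in your extract, one way to gauge how much of OCC1950 was allocated or edited is to tabulate the flag. The following is only a minimal sketch, not an official IPUMS tool: it assumes the extract is a CSV file with a QOCC column, and it assumes that code 0 means the value was left unaltered, which you should confirm against the QOCC codes page. The rows in `SAMPLE_EXTRACT` and the function name `flag_summary` are made up for illustration.

```python
import csv
import io
from collections import Counter

# Toy rows standing in for a real extract; the values are illustrative only.
SAMPLE_EXTRACT = """OCC1950,OCCSCORE,QOCC
100,14,0
290,42,0
970,20,4
979,0,0
"""


def flag_summary(csv_file, flag_column="QOCC", unaltered_code="0"):
    """Return (counts per flag code, share of records with an altered value).

    Assumption: `unaltered_code` marks records that were neither allocated
    nor edited. Check the variable's documentation before relying on this.
    """
    reader = csv.DictReader(csv_file)
    counts = Counter(row[flag_column].strip() for row in reader)
    total = sum(counts.values())
    flagged = total - counts.get(unaltered_code, 0)
    share = flagged / total if total else 0.0
    return counts, share


counts, share_flagged = flag_summary(io.StringIO(SAMPLE_EXTRACT))
print(dict(counts))
print(f"share allocated or edited: {share_flagged:.2%}")
```

For a real extract, you would pass an open file handle instead of the `io.StringIO` wrapper. If the flagged share is large, it may be worth rerunning your analysis with and without the flagged records to see whether the results are sensitive to them.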