Discrepancy in France 1999 Dataset IND Variable Codes

Hello,

I’ve encountered an issue while working with the France 1999 dataset and its “IND” variable. There seems to be a mismatch between the codes in my dataset and those listed in the documentation.

In my dataset:

. keep if sample == "France 1999"
(4,768,013 observations deleted)

. tab ind

  Industry, |
  unrecoded |      Freq.     Percent        Cum.
------------+-----------------------------------
       1000 |     47,322        4.10        4.10
       1010 |        466        0.04        4.14
       1030 |         11        0.00        4.14
       1110 |        265        0.02        4.16
/* additional codes omitted */

However, the IPUMS documentation lists completely different codes:

Code  Label                             Frequency
011   Agriculture, forestry, fishing       47,322
020   Agriculture and food industries      31,350
031   Clothing, leather goods              38,008
/* additional codes omitted */

Could you please check on this or let me know how to reconcile these different coding schemes? Thank you!

Thank you for bringing these discrepancies to our attention. You can find the labels for the IND codes in the France 1999 sample in the source variable FR1999A_INDCITI. We will revise the IND codes list to point to the FR1999A_INDCITI page.

Source variables correspond to the variables in the original datasets and serve as inputs for harmonized IPUMS variables. IND does not recode the original industry data; it combines the sample-specific industry source variables (e.g., FR1999A_INDCITI) into a single variable. You can therefore get the same unedited industry data, with the corresponding labels, from the IND source variables tab (see the screenshot of the tab below). This tab lists the source variables used for IND for every sample in your data cart.
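
If the source variable has been added to your extract, a rough Stata sketch along these lines may help you check the codes against their labels. It assumes the variable arrives under the lowercase name fr1999a_indciti and that its value labels are attached; both are assumptions about your particular extract, so adjust the name if yours differs.

```stata
* Sketch only: assumes the extract contains the source variable under the
* lowercase name fr1999a_indciti (an assumption, not confirmed above).
capture confirm variable fr1999a_indciti
if _rc == 0 {
    * Frequencies shown with the sample-specific value labels
    tabulate fr1999a_indciti

    * Count cases where IND and the source variable carry different codes
    count if ind != fr1999a_indciti & !missing(ind)
}
else {
    display as text "fr1999a_indciti is not in this extract; add it via the IND source variables tab"
}
```

If the count is above zero, tabulating the two variables against each other for those cases should show where they differ.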