Hello, I’m working with the full-count 1940 census in Stata. Some of the values for ERSCOR50 carry labels that imply an actual value an order of magnitude smaller (see screenshot below), though not consistently. For example, the label for observations with an ERSCOR50 of 44 implies an actual ERSCOR50 of 4.4. I’ve noticed this only seems to happen for integers, which makes me think the integer is the actual value (i.e., 44 is the actual value, not 4.4). But I don’t know for sure, and I can’t find anything about this. Does anyone know? Thanks!
I may not be understanding, but it should have one implied decimal, right? “ERSCOR50 has one implied decimal. For example, an ERSCOR50 value of 0061 should be interpreted as 6.1. This division is performed automatically in the extract setup files.”
Thank you for bringing our attention to this issue. You’re correct that there is an error in the data label. The correct value is the one stored in the data code, and it should be used without the accompanying label. In the example that you shared, 44 is the correct value and the 4.4 label should be disregarded.
While the original fixed-width .dat data files have one implied decimal place, in formatted data files (such as your Stata extract) this division is performed automatically. As a result, the provided data code is correct, while the labels shift the decimal point one unnecessary additional place in cases where the value is an integer. I’ve shared your findings with my colleagues on the data team, who will investigate further and fix this discrepancy.
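For anyone who wants to see the arithmetic, here is a minimal Python sketch of how one implied decimal place works. It is only an illustration, not the code used in the extract setup files. The helper name `decode_implied_decimal` and the raw string `"0440"` are made up for this example; `"0061"` is the value quoted from the variable description above.

```python
from decimal import Decimal


def decode_implied_decimal(raw: str, places: int = 1) -> Decimal:
    """Interpret a raw fixed-width field that has `places` implied decimal digits.

    Hypothetical helper for illustration only: it shifts the decimal point
    `places` positions to the left, which is the same as dividing by 10**places.
    """
    return Decimal(raw.strip()).scaleb(-places)


# Raw .dat value quoted in the documentation: 0061 is read as 6.1.
documented_example = decode_implied_decimal("0061")

# Assumed raw field for the case in this thread: 0440 is read as 44.0.
# A formatted extract (such as a Stata file) already stores this scaled value.
stored_value = decode_implied_decimal("0440")

# Shifting the decimal point a second time reproduces the misleading label.
mislabeled_value = stored_value.scaleb(-1)
```

Under these assumptions, `stored_value` equals 44 and `mislabeled_value` equals 4.4, which matches the discrepancy described above: the stored code has already been divided once, and the label reflects a second, unneeded shift.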