1940 full count top code incwage

Dear IPUMS,

I checked the proportion of observations whose incwage is topcoded in the 1940 full count data (below).

. fre incwage

incwage -- wage and salary income
-----------------------------------------------------------
              |      Freq.    Percent      Valid       Cum.
--------------+--------------------------------------------
Valid   0     |   5.03e+07      38.10      38.10      38.10
        1     |       6290       0.00       0.00      38.11
        2     |       7324       0.01       0.01      38.11
        3     |       8242       0.01       0.01      38.12
        4     |       6298       0.00       0.00      38.12
        5     |      14525       0.01       0.01      38.13
        6     |      14935       0.01       0.01      38.14
        7     |       6322       0.00       0.00      38.15
        8     |      17657       0.01       0.01      38.16
        9     |      10371       0.01       0.01      38.17
        10    |      43358       0.03       0.03      38.20
        11    |       3661       0.00       0.00      38.21
        12    |      30600       0.02       0.02      38.23
        13    |       5170       0.00       0.00      38.23
        14    |      10412       0.01       0.01      38.24
        15    |      39489       0.03       0.03      38.27
        16    |      15786       0.01       0.01      38.28
        17    |       3848       0.00       0.00      38.29
        18    |      23881       0.02       0.02      38.30
        19    |       2756       0.00       0.00      38.31
        :     |          :          :          :          :
        4982  |          9       0.00       0.00      68.18
        4983  |          9       0.00       0.00      68.18
        4984  |         14       0.00       0.00      68.18
        4985  |         22       0.00       0.00      68.18
        4986  |          8       0.00       0.00      68.18
        4987  |         14       0.00       0.00      68.18
        4988  |         21       0.00       0.00      68.18
        4989  |          8       0.00       0.00      68.18
        4990  |         64       0.00       0.00      68.18
        4991  |         13       0.00       0.00      68.18
        4992  |         65       0.00       0.00      68.18
        4993  |         14       0.00       0.00      68.18
        4994  |          7       0.00       0.00      68.18
        4995  |         35       0.00       0.00      68.18
        4996  |         26       0.00       0.00      68.18
        4997  |         12       0.00       0.00      68.18
        4998  |         18       0.00       0.00      68.18
        4999  |         69       0.00       0.00      68.18
        5000  |     393030       0.30       0.30      68.48
        5001  |   4.16e+07      31.52      31.52     100.00
        Total |   1.32e+08     100.00     100.00           
-----------------------------------------------------------

Then I checked the distribution in the 1940 1% sample. I assumed the two would be similar if the 1% sample is a random sample of the full count. But they differ a lot in the share of individuals whose incwage is topcoded at 5001:

-> sample = 1940 1%

incwage -- wage and salary income
------------------------------------------------------------
               |      Freq.    Percent      Valid       Cum.
---------------+--------------------------------------------
Valid   0      |     595645      44.07      44.07      44.07
        1      |        109       0.01       0.01      44.07
        2      |        143       0.01       0.01      44.08
        3      |         75       0.01       0.01      44.09
        4      |         84       0.01       0.01      44.10
        5      |        117       0.01       0.01      44.10
        6      |        133       0.01       0.01      44.11
        7      |         68       0.01       0.01      44.12
        8      |        148       0.01       0.01      44.13
        9      |        274       0.02       0.02      44.15
        10     |        505       0.04       0.04      44.19
        11     |         49       0.00       0.00      44.19
        12     |        412       0.03       0.03      44.22
        13     |         60       0.00       0.00      44.23
        14     |        121       0.01       0.01      44.24
        15     |        460       0.03       0.03      44.27
        16     |        182       0.01       0.01      44.28
        17     |         37       0.00       0.00      44.29
        18     |        276       0.02       0.02      44.31
        19     |         38       0.00       0.00      44.31
        :      |          :          :          :          :
        4923   |          1       0.00       0.00      74.76
        4930   |          1       0.00       0.00      74.76
        4935   |          1       0.00       0.00      74.76
        4940   |          4       0.00       0.00      74.76
        4943   |          1       0.00       0.00      74.76
        4946   |          1       0.00       0.00      74.76
        4948   |          1       0.00       0.00      74.76
        4950   |          5       0.00       0.00      74.76
        4952   |          1       0.00       0.00      74.76
        4965   |          1       0.00       0.00      74.76
        4976   |          1       0.00       0.00      74.76
        4980   |          2       0.00       0.00      74.76
        4990   |          1       0.00       0.00      74.76
        4992   |        186       0.01       0.01      74.78
        4993   |          1       0.00       0.00      74.78
        4997   |          1       0.00       0.00      74.78
        4999   |          1       0.00       0.00      74.78
        5000   |       3867       0.29       0.29      75.06
        5001   |        273       0.02       0.02      75.08
        999999 |     336811      24.92      24.92     100.00
        Total  |    1351732     100.00     100.00           
------------------------------------------------------------

Is it because those who are NIU for incwage in the full count data are also assigned the topcoded value of 5001? I checked again the distribution of incwage among people younger than 14:


. fre incwage if age<14

incwage -- wage and salary income
-----------------------------------------------------------
              |      Freq.    Percent      Valid       Cum.
--------------+--------------------------------------------
Valid   0     |    1396944       4.56       4.56       4.56
        1     |        532       0.00       0.00       4.57
        2     |        306       0.00       0.00       4.57
        3     |        577       0.00       0.00       4.57
        4     |        221       0.00       0.00       4.57
        5     |        360       0.00       0.00       4.57
        6     |        340       0.00       0.00       4.57
        7     |        102       0.00       0.00       4.57
        8     |        354       0.00       0.00       4.57
        9     |        159       0.00       0.00       4.57
        10    |        436       0.00       0.00       4.58
        11    |         79       0.00       0.00       4.58
        12    |        261       0.00       0.00       4.58
        13    |         63       0.00       0.00       4.58
        14    |         63       0.00       0.00       4.58
        15    |        347       0.00       0.00       4.58
        16    |        109       0.00       0.00       4.58
        17    |         36       0.00       0.00       4.58
        18    |        167       0.00       0.00       4.58
        19    |         19       0.00       0.00       4.58
        :     |          :          :          :          :
        4680  |          4       0.00       0.00       4.89
        4700  |          4       0.00       0.00       4.89
        4720  |          1       0.00       0.00       4.89
        4731  |          1       0.00       0.00       4.89
        4753  |          1       0.00       0.00       4.89
        4800  |         31       0.00       0.00       4.89
        4811  |          1       0.00       0.00       4.89
        4820  |          1       0.00       0.00       4.89
        4840  |          1       0.00       0.00       4.89
        4847  |          1       0.00       0.00       4.89
        4866  |          1       0.00       0.00       4.89
        4868  |          1       0.00       0.00       4.89
        4880  |          2       0.00       0.00       4.89
        4900  |          2       0.00       0.00       4.89
        4920  |          2       0.00       0.00       4.89
        4928  |          1       0.00       0.00       4.89
        4992  |          1       0.00       0.00       4.89
        4998  |          1       0.00       0.00       4.89
        5000  |        530       0.00       0.00       4.89
        5001  |   2.91e+07      95.11      95.11     100.00
        Total |   3.06e+07     100.00     100.00           
-----------------------------------------------------------

It seems that my guess is only partly correct: most of these people under age 14, who should be out of universe, have incwage of 5001. However, about 5% of them have seemingly valid incwage values lower than 5001. Why do they have seemingly normal incwage values when they should be NIU?

The miscoding of NIU cases as the topcode (5001) is a known issue with the 1940 full count data that was introduced earlier this year. It has been fixed internally, and the fix will be made public in the late summer or early fall.

As for the non-NIU values for individuals under age 14, I am consulting with my colleagues about this and will post again when I have more information.

Hi Matthew, thank you for the information! Just a quick follow-up question: is it possible for us to adjust the miscoding (about NIU and 5001) by ourselves (if it is a very simple task)? My thought is that I can simply recode everyone under age 14 from 5001 to NIU, but I am not sure if there is any other problem/miscoding involved in this issue that I am not aware of.

Regarding manual recoding of individuals to NIU: IPUMS USA generally does not enforce universe rules when the original microdata do not follow them perfectly. Because of this, assigning NIU codes based on the stated universe (“Persons age 14+, not institutional inmates”) will probably not exactly replicate the original data. That said, unless you have a specific reason to care about people who should be NIU but have apparently valid data, I think you’re better off just imposing the restriction yourself. So to answer your question: yes, that should be OK, though you’ll need to code both children under 14 and institutional inmates (GQ=3) to NIU. I’m not aware of any other issues.
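
As a rough sketch of that restriction in Stata: this assumes the extract includes GQ alongside AGE and INCWAGE, that variable names are lowercase as in the output above, and that 999999 is the NIU code (the value that appears in the 1940 1% tabulation). The name incwage_fix is hypothetical and chosen only for illustration.

```stata
* Sketch only: impose the stated INCWAGE universe
* ("Persons age 14+, not institutional inmates") by hand.
* Assumptions: variables incwage, age, gq exist; 999999 = NIU.

* Work on a copy so the original incwage stays untouched
generate long incwage_fix = incwage

* Children under 14 and institutional inmates (gq == 3) become NIU
replace incwage_fix = 999999 if age < 14 | gq == 3

* Sanity check: no out-of-universe case should keep a non-NIU value (expect 0)
count if (age < 14 | gq == 3) & incwage_fix != 999999
```

Keeping the original variable alongside the recoded copy makes it easy to compare the two, or to switch back once the corrected data are released.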

Regarding individuals under age 14 with data for INCWAGE: these are likely cases where the enumerator recorded data for a person when they were not supposed to. The IPUMS USA team plans to impose the stated universe for INCWAGE on these cases in the next data release, which will set these individuals to NIU.
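
If you do have a reason to look at these cases before the next release, one possible way to flag them is sketched below. The variable name oou_reported is hypothetical, and the sketch treats values from 0 through 5000 as reported wages, which is an assumption based on the tabulations above rather than a documented rule.

```stata
* Sketch only: flag out-of-universe persons who carry a reported wage value.
* Assumptions: variables incwage, age, gq exist; 0-5000 are reported values,
* while 5001 (topcode / miscoded NIU) and 999999 (NIU) are not.
generate byte oou_reported = (age < 14 | gq == 3) & inrange(incwage, 0, 5000)

* How many such cases are there?
tabulate oou_reported

* Distribution of age among the flagged cases
tabulate age if oou_reported == 1
```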
