Topcode Replacement Values IPUMS CPS ASEC

Hey IPUMS Forum,

I intend to use incwage, incbus and incfarm in an analysis of the IPUMS CPS ASEC samples from 1976 to 2016. After reading the extensive documentation on topcoding of income variables, I thought it wise to harmonize the topcoding method over this time frame by replacing the topcoded values of these three variables in 1976-2010 with the “Rank proximity swap values” from the “swapvalues” file. However, some of the values in the swapvalues file seem odd to me. Specifically, I fail to understand why the swapvalues file contains a non-zero value for cases whose originally reported income is well below the topcode threshold of the same variable in that year.

For example, in the 1988 ASEC sample, serial 318, pernum 1 has a value of 24,000 in incwage (combining inclongj of 19,000 and oincwage of 5,000). The same respondent has a value of 20,000 in incwage_swap. Another example is serial 92, pernum 1 in 1976: this respondent has a reported incwage of 37,500, which is replaced with a value of 39,000 when using incwage_swap. Many such examples exist in the data.

As far as I understand, only respondents with income above or equal to the topcode threshold were “ranked from lowest to highest and systematically swapped with other values”. According to the topcode tables provided, however, the incomes in these two cases are far below the topcode threshold of the corresponding year: 199,998 for 1988 (in the data it seems to be 99,999, but the point holds either way) and 50,000 for 1976.
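
To make the pattern concrete, a check along these lines would flag such cases. This is only a sketch: the records are the two examples above typed in by hand, the dictionary keys are illustrative names rather than a real file layout, and the 1988 threshold is assumed to be the 99,999 seen in the data rather than the 199,998 from the tables.

```python
# Topcode thresholds for incwage as cited above.
# Assumption: 99,999 is used for 1988 (the value seen in the data);
# the topcode tables list 199,998. Both examples fall below either value.
TOPCODE_INCWAGE = {1976: 50_000, 1988: 99_999}

# The two example records from this post; key names are illustrative only.
records = [
    {"year": 1988, "serial": 318, "pernum": 1,
     "incwage": 24_000, "incwage_swap": 20_000},
    {"year": 1976, "serial": 92, "pernum": 1,
     "incwage": 37_500, "incwage_swap": 39_000},
]


def below_threshold_swaps(rows, thresholds):
    """Return rows that have a non-zero swap value even though the
    reported income is below the topcode threshold for that year."""
    return [
        r for r in rows
        if r["incwage_swap"] != 0 and r["incwage"] < thresholds[r["year"]]
    ]


flagged = below_threshold_swaps(records, TOPCODE_INCWAGE)
for r in flagged:
    print(r["year"], r["serial"], r["pernum"], r["incwage"], r["incwage_swap"])
```

Both example records are flagged by this check, which is exactly what I did not expect from the documentation.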

Why do these cases have non-zero values in the swapvalues file? Should I use these values to replace the existing ones? What am I misunderstanding here?

Any response would be much appreciated. In any case though, thank you very much for your time.

These cases of seemingly non-top-coded values being swapped with other values (also below the top-code threshold) are very strange. It is possible that the Census Bureau, while generating these swap value files, took the opportunity to correct some values for these older files. I would recommend contacting the Census Bureau directly as they will have access to the restricted files and possibly be able to tell you the source of these strange swap values.

I’m sorry I couldn’t be more helpful.

I’m pretty sure that what is happening here is that when they have to swap out one value, they necessarily have to swap out two. Otherwise you could recover the original value from the aggregate value by subtraction. I would have said that it made more sense to swap whole households, so that there is no distortion of behavioral relationships, but looking at the swap values file, it does not look like they did that.
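
Here is a small sketch of the subtraction argument. All of the numbers are made up for illustration, and the assumption that the total is simply the sum of two published components is mine; I have not confirmed this is how the Census Bureau actually built the file.

```python
def recover_component(published_total, published_other):
    """Back out one component from a published total by subtraction."""
    return published_total - published_other


# Hypothetical person: all values are invented for illustration.
true_longest_job = 150_000     # assumed to be above a component topcode
other_wage = 5_000             # published as reported
true_total = true_longest_job + other_wage

swapped_longest_job = 170_000  # hypothetical swap value for the component

# Case 1: only the component is swapped and the total is left alone.
# Subtracting the other component from the total reveals the original value.
leaked = recover_component(true_total, other_wage)

# Case 2: the total is rebuilt from the swapped component.
# Subtraction now only returns the swap value, not the original.
consistent_total = swapped_longest_job + other_wage
masked = recover_component(consistent_total, other_wage)

print("component-only swap leaks:", leaked)
print("swapping the total as well gives:", masked)
```

If something like this is going on, a change in one component would have to carry through to the total, which could be why a value that looks far from the topcode still ends up with a swap value.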