Hey IPUMS Forum,
I intend to use incwage, incbus and incfarm in an analysis using IPUMS CPS ASEC samples from 1976 to 2016. After reading through the extensive documentation on topcoding of income variables, I thought it wise to harmonize the topcoding method over the time frame by replacing the values of these three variables in 1976-2010 with those from the “Rank proximity swap values” method, using the “swapvalues” file. However, some of the values in the swapvalues file seem odd to me. Specifically, I fail to understand why the swapvalues file contains non-zero values for cases whose original reported incomes are well below the topcode threshold of the same variable in a given year.
For example, in the 1988 ASEC sample, serial 318, pernum 1 has a value of 24,000 in incwage (combining inclongj 19,000 and oincwage 5,000). The same respondent has a value of 20,000 in incwage_swap. Another example is serial 92, pernum 1, in 1976: this respondent has a reported incwage of 37,500, which is replaced with 39,000 when using incwage_swap. Many such examples exist in the data.
As far as I understand, only respondents with income above or equal to the topcode threshold were “ranked from lowest to highest and systematically swapped with other values”, but according to the topcode tables provided, these values are far below the topcode threshold of the corresponding year (199,998 for 1988 [in the data it seems to be 99,999, but either way] and 50,000 for 1976).
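To make the check concrete, here is a minimal sketch of how such cases could be flagged. It assumes the extract and the swapvalues file have already been merged into one record per person; the `TOPCODES` dictionary and the `flag_unexpected_swaps` helper are hypothetical names, the thresholds are only the two quoted above (using the lower 99,999 figure for 1988), and the third sample row is made up for contrast.

```python
# Hypothetical topcode thresholds keyed by (year, variable). These are the
# two values quoted in this post, not an authoritative topcode table.
TOPCODES = {(1988, "incwage"): 99999, (1976, "incwage"): 50000}


def flag_unexpected_swaps(rows, variable="incwage"):
    """Return rows whose swap value is non-zero although the reported
    value is below the topcode threshold for that year."""
    flagged = []
    for row in rows:
        threshold = TOPCODES.get((row["year"], variable))
        if threshold is None:
            continue  # no threshold known for this year, skip the row
        reported = row[variable]
        swapped = row[variable + "_swap"]
        if swapped != 0 and reported < threshold:
            flagged.append(row)
    return flagged


# The two cases described above, plus one made-up row with a zero swap value.
sample = [
    {"year": 1988, "serial": 318, "pernum": 1, "incwage": 24000, "incwage_swap": 20000},
    {"year": 1976, "serial": 92, "pernum": 1, "incwage": 37500, "incwage_swap": 39000},
    {"year": 1988, "serial": 1, "pernum": 1, "incwage": 15000, "incwage_swap": 0},
]

unexpected = flag_unexpected_swaps(sample)
for row in unexpected:
    print(row["year"], row["serial"], row["pernum"], row["incwage"], row["incwage_swap"])
```

Both of the cases I describe are flagged by this kind of check, while the row with a zero swap value is not.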
Why do these cases have non-zero values in the swapvalues file? Should I use these values to replace the existing ones? What am I misunderstanding here?
Any response would be much appreciated. In any case though, thank you very much for your time.