You indicate here (https://cps.ipums.org/cps/income_cell…) that the swap values generally replace cells that were previously top-coded. So there are three different historical methods for replacing high incomes: top codes, average incomes across all top-coded values for a demographic group, and swap values. The swap values themselves come in two sets: those for 2011 on, and a separate set supplied later for 1975 to 2010.
A preliminary question: I had previously assumed that entire records were swapped when a value exceeded the swap threshold. From looking at the swapvalues.txt .csv file, it now appears to me that only individual cell values are swapped. Is this correct? Also, are the “income bands” used to select households for the swap defined on the swapped income component itself, or on the aggregate income of the individual or household in question?
I have questions concerning the summary values constructed from these values.
First, concerning what is in the standard IPUMS CPS extracts:
- For years where only topcodes are available, are summary values constructed by treating the top-coded values as if they equaled the topcode threshold?
- For years where average values are available, have these been inserted into the IPUMS-CPS extracts in place of the top-codes? If so, are summary values reported in IPUMS extracts constructed by replacing the topcoded values with average values and then totaling? Or by totaling the raw, non-topcoded values and then replacing the total with an average if the raw total exceeds the topcoding threshold? Or is some other method used? Or are the summary values not adjusted, so that we should recalculate them?
- For swapped values from 2011 on, does the swapping completely replace topcoding, or is there still a (presumably higher) topcode on swapped values? If the latter, are these topcodes provided anywhere? In either case, are summary values after swapping strictly the sums of the component values, or is some more complicated method used to calculate these totals?
- Are the retroactively provided swap values for 1975-2010 calculated in exactly the same way as the 2011-on values? If not, what are the differences?
- Was there a special, different method of topcoding summary values in 1990, using state medians above the threshold rather than the sum of topcoded items? (See this question: How should user's deal with topcoded values for tax--person variables pre 2010 when no swapvalues exist?). Or is this an ACS method? I don’t see any survey identified in that question. The ASEC high-income summaries are generally averages, not medians, correct?
- If the answer to any of these questions is “We don’t know,” is there anyone at the Bureau whom you would suggest as a good contact?
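To make the alternatives in my questions about summary values concrete, here is a minimal Python sketch of the three candidate constructions I have in mind. All function names, thresholds, group averages, and incomes are hypothetical and chosen only for illustration; none are actual ASEC values, and I am not assuming that any of these is the method IPUMS or the Census Bureau actually uses.

```python
# Hypothetical illustration only: every number below is made up,
# and "above threshold" is assumed to mean strictly greater than.

def sum_capped_at_threshold(components, thresholds):
    """Total after capping each component at its topcode threshold."""
    return sum(min(value, thresholds[name]) for name, value in components.items())


def sum_with_group_averages(components, thresholds, group_averages):
    """Total after replacing each over-threshold component with a group average."""
    return sum(
        group_averages[name] if value > thresholds[name] else value
        for name, value in components.items()
    )


def total_then_replace(components, total_threshold, total_average):
    """Total the raw components first, then replace the total if it is too high."""
    raw_total = sum(components.values())
    return total_average if raw_total > total_threshold else raw_total


# One hypothetical person with two income components.
components = {"wages": 250_000, "interest": 5_000}
thresholds = {"wages": 150_000, "interest": 50_000}
group_averages = {"wages": 300_000, "interest": 80_000}

capped = sum_capped_at_threshold(components, thresholds)
averaged = sum_with_group_averages(components, thresholds, group_averages)
replaced = total_then_replace(components, total_threshold=200_000, total_average=400_000)

print(capped, averaged, replaced)
```

With these made-up inputs the three constructions give three different totals for the same person, which is why I would like to know which one (if any) the extracts use.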