You state here: https://cps.ipums.org/cps/income_cell_means.shtml that Larrimore et al. provide consistent cell means for 1976 to 2007. Immediately following this, you provide files containing IPUMS-consistent cell means for 1976 through 2000. Are these the Larrimore et al. values, or some other set of values? If the former, is there some reason you do not provide the 2001-2007 values from Larrimore et al.? Or do you provide them elsewhere?
On your page “Income Components: Topcodes, Replacement Values, and Swap Values” you provide cell mean replacement values for the years 1996 through 2010. Am I correct that these values, and not the top codes, are what is in the IPUMS microdata downloads now? Are these values the same as or different from the Larrimore et al. cell means through 2007? If different, where do they come from? In the various tables presented on this page, are the numbers provided as top codes the original CPS top codes before they were replaced by cell means? Or are they, as they appear to me, the cutoff values above which incomes were top-coded and subsequently replaced by cell means?
I have a strong request concerning this page. The term “top code” is used sometimes to refer to an uninformative code (like 99997) indicating that the observed value was above the top-coding threshold, sometimes to refer to the threshold itself, and sometimes it is not clear which is meant. So for instance, in the years 1999-2002 in the first table, are the values of 15,000, 20,000, and 25,000 actually top codes? Or are they top code thresholds? And in the various tables like “1996 Income Topcodes” for different years, the values that are identified as topcodes look to me a lot more like top-coding thresholds. I’d like somebody to review the language on this page and make sure things are properly identified as top codes or top code thresholds.
Were the thresholds originally used as the topcodes in these years, before being replaced by cell means and then subsequently by swap values?
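To make the distinction concrete, the three treatments being asked about can be sketched as follows. This is only an illustration: the threshold, the sentinel code, and the incomes are invented and are not actual CPS values, and a real cell mean would be computed within demographic cells rather than over everyone above the threshold.

```python
THRESHOLD = 25000  # hypothetical top-coding threshold (the cutoff)
SENTINEL = 99997   # hypothetical uninformative top code


def censor_at_threshold(incomes, threshold=THRESHOLD):
    """Report the threshold itself for any income above it."""
    return [min(x, threshold) for x in incomes]


def flag_with_sentinel(incomes, threshold=THRESHOLD, sentinel=SENTINEL):
    """Replace any income above the threshold with an uninformative code."""
    return [sentinel if x > threshold else x for x in incomes]


def replace_with_cell_mean(incomes, threshold=THRESHOLD):
    """Replace any income above the threshold with the mean of all such incomes."""
    above = [x for x in incomes if x > threshold]
    if not above:
        return list(incomes)
    cell_mean = sum(above) / len(above)
    return [cell_mean if x > threshold else x for x in incomes]


incomes = [12000, 25000, 30000, 50000]  # made-up true incomes
censored = censor_at_threshold(incomes)
flagged = flag_with_sentinel(incomes)
cell_means = replace_with_cell_mean(incomes)
```

In this toy case the same two records come out as 25000 under the first treatment, 99997 under the second, and 40000.0 under the third, which is why it matters which of these a table labeled "topcodes" is reporting.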
Under the heading “Income Component Rank Proximity Swap Values”, here https://cps.ipums.org/cps/income_cell_means.shtml (at the bottom of the page, below the section on cell means) you provide a file of income replacement values for 1976-2010. The link there to the original Census files is broken. I did a lot of searching of the Census website looking for either the originals or updated versions of these files, and I found no reference to them. If I were to download CPS data from the Census now, would I see top codes, cell means, or swap values in these years?
I am confused about how the values in this swap values file relate to top codes. I thought that the swap values were supposed to completely replace top codes. However, the file contains 1044 values of 99997, i.e., a bit less than 1 percent of the individual records are so coded. These certainly look like top codes. To me, this seems to imply that the old cell means for values above the top-coding threshold have only been replaced by swap values up to a certain new, higher threshold, above which values are still top-coded and, moreover, not replaced by cell means. Do you know if this is correct? If so, do we know what the new thresholds are? Do the IPUMS-CPS records for 2011 and following years also contain top codes in addition to swap values? And if so, do we have the thresholds for these top codes?
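For anyone who wants to check a count like this, a short script along the following lines should do it. This is a sketch under assumptions: the column names and the sample rows are made up for illustration, the real swap values file may be laid out differently, and 99997 is taken as the code of interest only because it is the value observed in the file.

```python
import csv
import io

SENTINEL = 99997  # the value that looks like a top code in the swap values file

# Made-up stand-in for the swap values file; the column names are hypothetical.
SAMPLE = """year,incwage_swap,incbus_swap
2009,150000,99997
2009,99997,42000
2010,310000,58000
2010,275000,99997
"""


def count_sentinel(rows, columns, sentinel=SENTINEL):
    """Return (sentinel values, records with any sentinel, total records)."""
    n_values = 0
    n_flagged = 0
    n_records = 0
    for row in rows:
        n_records += 1
        hits = 0
        for col in columns:
            value = row[col].strip()
            if value and float(value) == sentinel:
                hits += 1
        n_values += hits
        if hits:
            n_flagged += 1
    return n_values, n_flagged, n_records


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
n_values, n_flagged, n_records = count_sentinel(rows, ["incwage_swap", "incbus_swap"])
share_flagged = n_flagged / n_records
```

To run it against the actual file, the `io.StringIO(SAMPLE)` handle would be replaced with an open file handle and the column list with the file's real income component columns.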
I thought the CPS had gotten past top-coding. <Grrrr!>