Let me preface this question by acknowledging that there has been a lot of discussion on this forum about how to tell apart top codes, replacement values, and swap values. I have combed through those posts and am still having a difficult time understanding why maximum possible values appear in the CPS data for 1996 - 2018. AHoerner’s question here clarifies that so-called “top code” values ending in ‘7’ essentially denote item non-response. But if the swap and mean replacement procedures are in place for the CPS data from 1996 - 2018, why does one also find maximum possible values that exceed the thresholds for mean replacement and swapping? I am using the language of the revamped top code table here, in which ‘maximum possible values’ generally end in ‘9’ and frequently consist of a string of ‘9’ digits.
For example, I am using data from the 2016 CPS ASEC. After dropping the N.I.U. values, I find that the variable INCSURV1 still includes values equal to 99,999. According to the top code table, this is the ‘maximum possible value’. However, since these data are from 2016, I am puzzled: I would not expect this value to appear at all, because I would expect it to have been swapped under the topcoding procedure for that year.
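To make the check concrete, here is a minimal sketch of how I am counting these records. It is plain Python over a list of INCSURV1 values, not my actual extract. The N.I.U. code below is a placeholder assumption (please check the codebook for the real one), the sample values are made up for illustration, and the function name is my own.

```python
# Assumed codes: confirm both against the codebook for your own extract.
NIU_CODE = 999999      # hypothetical N.I.U. placeholder, not taken from documentation
MAX_POSSIBLE = 99999   # the INCSURV1 value I see remaining in the 2016 data


def summarize_max_values(values, niu_code=NIU_CODE, max_possible=MAX_POSSIBLE):
    """Count in-universe records and how many sit exactly at the maximum possible value."""
    in_universe = [v for v in values if v != niu_code]
    at_max = sum(1 for v in in_universe if v == max_possible)
    share = at_max / len(in_universe) if in_universe else 0.0
    return {"in_universe": len(in_universe), "at_max": at_max, "share_at_max": share}


# Made-up illustrative values, not real CPS records.
sample = [12000, 99999, 999999, 45000, 99999, 999999]
summary = summarize_max_values(sample)
print(summary)
```

On my real extract, the `at_max` count is nonzero, which is what prompts the questions below.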
I have two outstanding questions, then. First, what do maximum possible values mean in this context, where they exceed the swap threshold but still remain in the data? In other words, why aren’t they swapped? Second, what ought one to do with them? Of course this is up to the researcher, but if they stand for actual incomes rather than ‘item non-response’, wouldn’t one want to correct for them in some fashion, since they are often outlier values?