In looking at the 1990 chart of top codes and state medians (for values above the top code) for HHINCOME and INCTOT, I see that top and bottom codes are given as N/A. Does this mean that top/bottom codes exist and were applied, but are not made public? If so, couldn’t a person simply look at the highest/lowest published value for national data and assume that the top/bottom code was equal to this? Also, if the top/bottom code is given as N/A, do the values in the table still refer to state medians for values above the (nonpublished) top code?

Or, is there some other way to determine the top/bottom codes for these variables?

I believe you are using this chart, but correct me if I’m wrong. FTOTINC and HHINCOME were topcoded by state and topcoded values were replaced with state medians. The replacement values are given by state in the chart.

I think the N/A in the first row means that there is no single national topcode cutoff above which all cases were given a replacement value. Instead, the topcoding and replacement were carried out by state. You couldn’t look at the highest published value in the national data to determine the topcode, as it would be a replacement value (a state median above some point) for a wealthy state. The values in the table still refer to state medians for values above some cutoff value, but the cutoff varies by state.

Thanks for the reply, Brandon, but I need to make sure I understand your answer.

You said that “FTOTINC and HHINCOME were topcoded by state and topcoded values were replaced with state medians”. Suppose that the state median for HHINCOME was $50k, that the threshold for the topcode was $100k, and that HH #1 responded with a value of $70k and HH #2 responded with a value of $200k. This means that HHINCOME for households 1 and 2 would be coded as $70k and $50k, respectively. That is, the ordering of these two households by household income would be reversed in the database, compared with their true ordering. Do I have this right?

If so, it makes it more difficult to produce a meaningful histogram of HHINCOME. I suppose one would have to search records for the given median value, assume that all HHs displaying the median value have in fact been topcoded, and then treat this subset of HHs as if their HHINCOME was larger than the largest value recorded. That is, the largest value recorded (for a state) would provide a (low) estimate of the actual topcode threshold. It also means that any HH that happens to actually match the median income would wrongly be considered a topcoded HH. Is all of this true? Or is there a better way to produce a meaningful histogram of HHINCOME?

I apologize for my confusing reply. As I mentioned in the second paragraph (but didn’t make clear in the first), the replacement value is the state median of incomes above some cutoff point. That is, suppose this cutoff point in some state is $100,000. Then the median of incomes above $100,000 would be calculated, and all incomes above $100,000 would be replaced with this median value.
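In case it helps to see the procedure concretely, here is a minimal Python sketch of topcoding by state as described above. Everything in it is an assumption for illustration: the function name `topcode_by_state`, the single made-up state "A", the $100,000 cutoff, and the four incomes are hypothetical, and this is not the actual code or the actual cutoffs used to produce the data.

```python
from statistics import median

def topcode_by_state(records, cutoffs):
    """Replace incomes above each state's cutoff with the state median
    of the incomes above that cutoff.

    records: list of (state, income) pairs (hypothetical layout)
    cutoffs: dict mapping state -> cutoff (hypothetical values)
    """
    # Collect, per state, only the incomes that exceed the cutoff.
    above = {}
    for state, income in records:
        if income > cutoffs[state]:
            above.setdefault(state, []).append(income)

    # The replacement value is the median of those above-cutoff incomes.
    replacement = {state: median(values) for state, values in above.items()}

    # Incomes at or below the cutoff are left unchanged.
    coded = [
        (state, replacement[state] if income > cutoffs[state] else income)
        for state, income in records
    ]
    return coded, replacement

# Hypothetical example: one state, cutoff of $100,000.
records = [("A", 70_000), ("A", 120_000), ("A", 200_000), ("A", 350_000)]
cutoffs = {"A": 100_000}
coded, replacement = topcode_by_state(records, cutoffs)

# A record can be flagged as topcoded when it equals its state's replacement value.
flagged = [income == replacement[state] for state, income in coded]
```

In this example the three incomes above $100,000 all become $200,000, while the $70,000 household keeps its value. Because the replacement value is a median of incomes above the cutoff, it is itself above the cutoff, so a topcoded household never ends up ranked below a household that was not topcoded.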