Property Taxes (PROPTX99) Topcode

I’ve been investigating property tax topcodes in the ACS IPUMS variable PROPTX99 and its source variables.

The IPUMS documentation says its maximum category value (69) reflects reported values higher than 10,000 USD in all years for which the variable is available. Moreover, the editing procedure says: “If the reported real estate taxes (PROPTX99) exceed the top code, the value will be replaced with the top code.”

The ACS PUMS documentation of the Census Bureau says that reported values are topcoded if they exceed a “top-code national minimum value”.

For every year, the Census Bureau publishes tables showing this minimum value (which varies between years and sometimes even between states) as well as the mean of all reported property taxes exceeding it. These descriptions and data are also available on the ACS IPUMS page on TOP CODED AND BOTTOM CODED VALUES FOR ACS/PRCS SAMPLES BY STATE.

Given this information, I would like to clarify how the 10,000 USD topcode was implemented in the IPUMS ACS variable PROPTX99:

Does the editing procedure use the year (and state) specific minimum value? Or does it simply set values above 10,000 USD equal to 10,000 USD – irrespective of the year (and state)?

Thanks for your answer!

The variable PROPTX99 is top coded at $10,000 in all ACS samples. The page on top and bottom coded variables in the ACS states that “all base dollar amounts are top-coded using the state mean of all cases greater than or equal to the top-code state minimum value. The only exception to the top-code state minimum value is TAX which uses the top-code national minimum value.” TAX refers to the variables TAXP or TAXAMT, the names of the original ACS variables corresponding to the IPUMS USA variable PROPTX99. Looking through the tables on the linked page, you will see that the top code for PROPTX99 does not vary by state or year. I was not able to find any documentation from the Census suggesting that the top code varies.

Please follow up with any additional questions or if you have a link to any information from the Census or IPUMS suggesting that the top code varies by year or state. Thank you.

Thanks for your reply!

Quick follow up:

The Census material I refer to is published on their ACS PUMS documentation website. Here for 2015. Navigate to “PUMS Top Coded and Bottom Coded Values”, download the linked file and open it with Acrobat. It contains a pdf and a csv file.

The csv file has variables called “taxtpct” and “taxp”.

The pdf has some documentation but it does not answer those two questions:

  1. How do the variables “taxtpct” and “taxp” compare to the uniform 10,000 USD topcode you mention? (They are always larger than 10,000 USD and vary between states.)
  2. How do they relate to the correspondence between the 69 categories and the dollar values listed in the description of PROPTX99? (Does 69 refer to a different dollar value in different states?)

Thanks a lot for looking into this.

@Isabel_Pastoor brought this question to the rest of the User Support team for discussion, since the answer to your questions is not at all obvious. After some extensive investigation, here is what we feel that we can confidently say about the topcodes for PROPTX99:

In each year, the column TAXTPCT in the Original PUMS csv files available at the IPUMS USA topcodes page gives the threshold above which the property tax variable is topcoded. How this is determined has changed over the years.

For 2016 and 2017: This is exactly 10,000.

For years prior to 2016: There is a (mostly) uniform threshold across states, which is the “top-code national minimum value”. This is always above 10,000 in these years. For a handful of small states each year, there is a different value for TAXTPCT, which is lower than the national value but still above 10,000. It appears that the Census Bureau treated these states differently, without mentioning it in the written documentation. In some cases, there is a blank value for TAXTPCT. We believe the blanks mean that no cases were topcoded.

For 2018 and later: A state-specific threshold is used. In many cases this is below 10,000. Which coded value corresponds to the topcode for a specific state in a specific year is not captured in the IPUMS variable labels, so the tables at the topcodes page must be used to determine this.

For all years, the value of TAXP or TAXAMT or T_TAX (depending on the year) gives the mean value of cases above the threshold in a given state.

We believe that the Census Bureau uses a two-step process to assign the value of PROPTX99 for years prior to 2018:

–First, cases are topcoded according to the thresholds listed in the tables. The topcode is actually the mean value above the threshold, captured in TAXP or TAXAMT.

–Second, the topcoded variable is assigned to the codes 1-69, corresponding to different intervals. Thus the original topcode is again topcoded.

For 2017 and earlier, this means that 10,000 is the effective topcode in the publicly available data, since TAXP is always above 10,000.
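The two-step process described above can be sketched in Python. This is only an illustration of our understanding, not Census Bureau or IPUMS code. The function names are hypothetical, the interval table is partial (codes 66 and 69 are taken from this thread, codes 67 and 68 are inferred to fill the gap between them, and lower codes are omitted), and the threshold of 12,000 and mean of 18,000 in the usage lines are made-up numbers, not values from any published table.

```python
# Partial PROPTX99 interval table: (lower bound in dollars, code).
# Codes 66 and 69 come from the thread; 67 and 68 are inferred and should be
# checked against the PROPTX99 codes page. Codes below 66 are omitted.
TOP_INTERVALS = [
    (7000, 66),   # $7,000-7,999
    (8000, 67),   # $8,000-8,999 (inferred)
    (9000, 68),   # $9,000-9,999 (inferred)
    (10000, 69),  # $10,000+
]


def step1_topcode(reported, threshold, mean_above):
    """Step 1: replace a value above the threshold (TAXTPCT) with the state
    mean of cases above it (TAXP / TAXAMT). A threshold of None stands for a
    blank TAXTPCT, which we believe means no cases were topcoded.
    Treating the boundary as strictly 'above' is an assumption."""
    if threshold is not None and reported > threshold:
        return mean_above
    return reported


def step2_to_code(amount):
    """Step 2: assign a dollar amount to an interval code.
    Returns None for amounts below the partial table."""
    code = None
    for lower, candidate in TOP_INTERVALS:
        if amount >= lower:
            code = candidate
    return code


def proptx99_code(reported, threshold, mean_above):
    """Apply both steps in order."""
    return step2_to_code(step1_topcode(reported, threshold, mean_above))


# Hypothetical pre-2018 state-year: threshold 12,000, mean above it 18,000.
print(proptx99_code(15000, 12000, 18000))  # topcoded in step 1, then binned
print(proptx99_code(10500, 12000, 18000))  # not topcoded in step 1, same bin
```

Both usage lines end up in the same code, which is why the first-step threshold is invisible in the coded variable whenever it lies above 10,000.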

Starting in 2018, the Census Bureau did away with its two-step topcoding procedure. Instead, they included the original topcoded value (TAXAMT) in the publicly-available microdata, even if it is above 10,000. Moreover, the value of TAXAMT is sometimes below 10,000.

In order to maintain consistency with earlier years, IPUMS recodes these (topcoded) values into the original set of codes 1-69. This is identical to the second step used by the Census Bureau in earlier years.

For example, Alabama in 2019 has a value of 5250 for TAXTPCT (the threshold) and 7500 for T_TAX (the mean value above the threshold). Topcoded cases (all cases above 5250) are assigned a code of 66 ($7000-7999), since that is the interval that T_TAX falls into. Note that it is NOT the case that everyone with a code of 66 in Alabama in 2019 has taxes above 7000. So both the topcode AND the threshold must be used to fully understand which cases are represented by the value of 66.
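To make the Alabama example concrete, here is a small Python sketch of how one might work out which actual tax amounts a given code can represent in a state-year. The function name is hypothetical, intervals are passed in as (low, high) dollar bounds rather than looked up by code, and exact boundary handling (strictly above versus at or above the threshold) is glossed over.

```python
def actual_range_for_code(interval, threshold, mean_above):
    """Return (low, high) bounds on the actual taxes behind a coded interval,
    given the state-year threshold (TAXTPCT) and mean above it (T_TAX).
    high is None when unbounded above; the result is None when no case can
    carry this code. A sketch only, not IPUMS code."""
    low, high = interval
    contains_mean = low <= mean_above and (high is None or mean_above <= high)
    if contains_mean:
        # All topcoded cases land in this interval, so it covers everything
        # above the threshold, plus any untopcoded cases from `low` upward
        # when the threshold lies above `low`.
        return (min(low, threshold), None)
    if low > threshold:
        # Every value this high was replaced by the mean in the first step.
        return None
    # An ordinary interval, cut off at the threshold if it straddles it.
    return (low, threshold if high is None else min(high, threshold))


# Alabama 2019, using the values quoted above: threshold 5250, mean 7500.
print(actual_range_for_code((7000, 7999), 5250, 7500))  # the $7000-7999 code
print(actual_range_for_code((8000, 8999), 5250, 7500))  # an interval above it
```

For the $7000-7999 interval the lower bound comes back as the threshold (5250) rather than 7000, and the interval just above it comes back empty, since every value that high was already replaced by the mean.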

While this recoding loses some information in the original variable (where the topcode is above 10,000), it enhances comparability over time. If you would like to access the original data, you can use the source variables, for example US2021A_0110.
