Difference between "racenew" and "racebr" variables?


I am using data from NHIS 2017-2018. I cross-tabulated the racenew and racebr variables expecting them to be the same because racebr is a bridge variable to Post-1997 OMB standards, but they’re slightly different. Which one should I use?

Thank you,

My apologies, I made a typo: I’m using 2007-2018 data

The IPUMS variables RACEBR and RACENEW are based on the same survey question, but each is derived from a separate underlying variable released by the NCHS in the original NHIS data files. RACENEW reports a person’s main racial background, whereas RACEBR captures additional detail, specifically about persons reporting more than one race. The OMB standards set the minimum categories to include; RACENEW generally meets these (though Native Hawaiian/Pacific Islander is suppressed because of privacy concerns), while RACEBR provides additional detail beyond the OMB standards.

The comparability tab for RACENEW provides additional information about how multiple race and other race responses are handled in this variable:

  • Beginning in 2000, persons self-reporting multiple Asian races were coded as “Asian” in RACENEW; persons self-reporting multiple American Indian/Alaskan native races were combined as “American Indian/Alaskan Native”; persons reporting any other combination of multiple races were coded as “Multiple Race.”
  • The “Other race” category is not available in RACENEW from 2003-2018; persons who report an OMB race group as well as “Other Race” in 2003-2018 are assigned to the OMB race group; persons who report only “Other Race” are treated as missing and their race is imputed.
  • Beginning in 2003, a new category, “Primary Race Not Releasable,” was added to preserve the privacy of persons whose race category had very few members in the survey (e.g., “Native Hawaiians/Other Pacific Islander”).
  • In 2000-2018, “unknown” responses were no longer included in the data. Instead, the NHIS began imputing race (and Hispanic origin) to improve overall quality of the data.

I would still expect that the majority of cases would have the same assignment between the two variables, but would not be surprised to see the following differences based on the comparability information in RACENEW:

  • Persons who are classified as “Multiple race” in RACEBR who are assigned to a single race category in RACENEW (e.g., persons who report one OMB race group and “Other race” or persons who report more than one Asian race).
  • Persons who have a more detailed Asian race in RACEBR who are assigned to the broad “Asian” category in RACENEW (e.g., persons who report their race as “Chinese”).
  • Persons who are classified as “Other race” in RACEBR who are assigned to any of the following RACENEW categories:
    • a single OMB race group (e.g., persons whose race is imputed because they don’t provide additional detail),
    • “Primary race not releasable” (e.g., persons who report a single race that cannot be released because of privacy concerns, such as Native Hawaiian/Pacific Islander),
    • “Multiple race” (I am less certain on this, but suspect that persons who report more than one “Other race” and did not select a main racial background, or who report “Other race” paired with a group that is not releasable may end up in this category; I also see cases where race/ethnicity was imputed to “Multiple race” in RACENEW for persons assigned to “Other race” in RACEBR).
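The expected differences listed above can be checked by cross-tabulating the two variables and listing the pairs that disagree. The following is a minimal sketch using only the Python standard library. The records and category labels are made up for illustration; a real IPUMS NHIS extract stores numeric codes, which you would need to look up on each variable's codes page and map to labels first.

```python
from collections import Counter

# Toy records. The labels are illustrative placeholders, NOT actual IPUMS codes.
records = [
    {"RACENEW": "White", "RACEBR": "White"},
    {"RACENEW": "Asian", "RACEBR": "Chinese"},
    {"RACENEW": "Multiple race", "RACEBR": "Black/African American"},
    {"RACENEW": "Black/African American", "RACEBR": "Black/African American"},
    {"RACENEW": "White", "RACEBR": "Multiple race"},
]


def cross_tabulate(rows, row_var, col_var):
    """Count how often each (row_var, col_var) pair of values occurs."""
    return Counter((r[row_var], r[col_var]) for r in rows)


table = cross_tabulate(records, "RACEBR", "RACENEW")

# Pairs whose labels differ between the two variables.
discordant = {pair: n for pair, n in table.items() if pair[0] != pair[1]}

for (racebr, racenew), n in sorted(discordant.items()):
    print(f"RACEBR={racebr!r} -> RACENEW={racenew!r}: {n}")
```

Because the comparison is by label, an expected pairing such as a detailed Asian group in RACEBR and the broad “Asian” category in RACENEW also shows up as discordant, so each listed pair still needs to be reviewed against the comparability notes.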

Which variable you use depends on your specific research application; I would encourage you to think about how RACENEW might mask more detailed information available in RACEBR and whether that is appropriate for your work.

Thank you so much for your reply! This is very helpful.
Something else I am seeing in the data is that some respondents who are categorized as a single race in RACEBR are classified as “multiple race” in RACENEW, which confuses me and wasn’t among the examples of differences between the variables. For example, over 2% of my sample identified as Black/African American in RACEBR are identified as “multiple race” in RACENEW. As I am dealing with very small subpopulations and classifying my racial categories into non-Hispanic White, non-Hispanic Black, non-Hispanic Other, and Hispanic, it’s important for me to determine which category is most appropriate. Thank you!

I am unable to find documentation that sheds light on this specific situation. I was able to confirm that this happens both to persons with imputed race values and to persons with originally reported race values, and that these relationships persist in the original data (i.e., they are not a result of an error in the IPUMS recoding logic). Regarding which variable to use, the guidance in the NCHS’s survey descriptions for NHIS data (here is the 2018 survey description) does not make a specific recommendation, but encourages you to read the description of each variable carefully to ensure you are selecting the correct one. I will share a table of the 2018 variables and descriptions from the survey description document and note which variables these correspond to in IPUMS NHIS (you can always look these up directly using the IPUMS - NHIS Concordance tool from IPUMS NHIS).

  • ORIGIN_I corresponds to the IPUMS variable HISPYN
  • HISPAN_I corresponds to the IPUMS variable HISPETH
  • RACERPI2 corresponds to the IPUMS variable RACENEW
  • MRACRPI2 corresponds to the IPUMS variable RACEA
  • MRACBPI2 corresponds to the IPUMS variable RACEBR

I suggest reviewing these descriptions to determine which variable is most appropriate for your specific application. You may also be interested in the NCHS page on Race and Hispanic Origin Information in the NHIS. This page includes information on the historical context of these variables, frequently asked questions, and a link for contacting NCHS directly with a question.
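One way to gauge how much the choice matters for the four-category grouping described in the question (non-Hispanic White, non-Hispanic Black, non-Hispanic Other, Hispanic) is to build the grouping once from each race variable and count the cases that change category. The sketch below assumes the data have already been recoded to text labels and a boolean Hispanic flag; the function name, field names, labels, and records are all hypothetical, not IPUMS codes or NCHS-recommended logic.

```python
def four_category(is_hispanic, race_label):
    """Collapse Hispanic origin and a race label into four groups."""
    if is_hispanic:
        return "Hispanic"
    if race_label == "White":
        return "Non-Hispanic White"
    if race_label == "Black/African American":
        return "Non-Hispanic Black"
    return "Non-Hispanic Other"


# Toy records with illustrative labels only.
people = [
    {"hispanic": False, "RACENEW": "Multiple race", "RACEBR": "Black/African American"},
    {"hispanic": False, "RACENEW": "White", "RACEBR": "White"},
    {"hispanic": True, "RACENEW": "White", "RACEBR": "White"},
    {"hispanic": False, "RACENEW": "Asian", "RACEBR": "Chinese"},
]

# Cases whose four-category group depends on which race variable is used.
changed = [
    p for p in people
    if four_category(p["hispanic"], p["RACENEW"])
    != four_category(p["hispanic"], p["RACEBR"])
]
share_changed = len(changed) / len(people)
print(f"{len(changed)} of {len(people)} cases change group ({share_changed:.0%})")
```

In this toy data only the first record moves (from Non-Hispanic Other under RACENEW to Non-Hispanic Black under RACEBR). With real data, survey weights would also need to be applied before interpreting any shares.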