Deriving a nationally representative sample from family respondents in the NHIS


I would greatly appreciate help with the representativeness of the family respondent sample in the NHIS. To deal with proxy responses in the NHIS, I am limiting my analysis to family respondents (famrespflag = 2). I wonder how close my weighted sample would be to the national sample. Since a family respondent is chosen in most households, I understand that the sample would still be representative of U.S. households. At the individual level, however, I am not sure how representative the family respondent sample would be. Thank you for your help.

I would not necessarily expect family respondents to be representative of the individuals included in the broader NHIS sample. For the 1997-2018 data, you can compare the family respondents' characteristics to those of the rest of the family roster. I ran a few quick cross-tabs for these years: family respondents are more likely to be female (especially in households with 3 or more persons, using FAMSIZE) and more likely to be the householder (using RELATE). You may want to compare the basic demographic characteristics of your family respondents to those of the overall NHIS sample. Because you are restricting your analytical sample in this way, I am also linking the sample code for variance estimation for subpopulations, which accounts for the complex survey design of the NHIS. Please let me know if you are working with the 2019 data instead; the NHIS underwent a major questionnaire redesign in 2019, and the information I have shared is based on the 1997-2018 design.

Thank you so much for your help. I am working with 1997-2018 data, so your explanation works for me. I will go through the variance estimation guide as you suggested.

