Using NHIS Imputed Income Data (1997-2018)

Hi, I am working on a dissertation project that involves combining NHIS data from 1997-2018, and I thought IPUMS would be a great help in doing this. I’ve downloaded the data but am unsure how to appropriately use the imputed income variables on the file. For the public-use NHIS, there are five separate imputed income datasets that are read in together to create an overall income variable, but in IPUMS I only see the five variables, and I cannot figure out how to analyze them appropriately. Any instruction would be helpful. Thank you.

When using income variables from the NHIS, you need to use all five imputation variables associated with each income variable to create a complete dataset. NCHS imputes each income variable five times to address the high number of missing values. Each of the five imputation variables holds the result of a separate imputation that uses the same method.

For example, the IPUMS NHIS variable EARNINGS (original NHIS variable ERNYR_P) must be used with the five imputation variables EARNIMP1, EARNIMP2, EARNIMP3, EARNIMP4, and EARNIMP5. The variable EARNINGS does not include any imputed values.
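As a rough illustration of working with the five variables together, the sketch below stacks them into a "long" layout with one row per person per imputation, so the same analysis can be run once per imputation. It is only a sketch: the `person_id` key, the `earn_imputed` output name, and the example values are made up for illustration, and only the EARNIMP1-EARNIMP5 names come from IPUMS NHIS.

```python
# Names of the five imputation variables that accompany EARNINGS.
IMPUTED_VARS = [f"EARNIMP{m}" for m in range(1, 6)]


def stack_imputations(records, imputed_vars=IMPUTED_VARS, out_name="earn_imputed"):
    """Return one row per input record per imputation.

    `records` is a list of dicts, one per person, each holding the five
    imputation variables plus any other columns. The output keeps the other
    columns, adds an `imputation` index (1..5), and stores that imputation's
    value under `out_name` (a hypothetical name chosen for this sketch).
    """
    stacked = []
    for m, var in enumerate(imputed_vars, start=1):
        for rec in records:
            # Copy everything except the five wide imputation columns.
            row = {k: v for k, v in rec.items() if k not in imputed_vars}
            row["imputation"] = m
            row[out_name] = rec[var]
            stacked.append(row)
    return stacked


# Made-up example: two people, five imputed values each.
people = [
    {"person_id": 1, "EARNIMP1": 3, "EARNIMP2": 4, "EARNIMP3": 3,
     "EARNIMP4": 5, "EARNIMP5": 4},
    {"person_id": 2, "EARNIMP1": 7, "EARNIMP2": 7, "EARNIMP3": 7,
     "EARNIMP4": 7, "EARNIMP5": 7},
]
long_rows = stack_imputations(people)
```

Each subset of `long_rows` with the same `imputation` value can then be treated as one completed dataset.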

The NCHS provides information on how to use all the imputations to create a complete data set. This appendix to IHIS data brief number 2 includes some sample SAS code. This white paper from the NCHS describes the imputation process and also provides sample SAS code. I am not sure what statistical package you are working with, but the multiple-imputation suite of commands in Stata or the mice package in R would be good places to start. From NCHS documentation:

“After analyzing each of the M completed data sets resulting from multiple imputation, one can combine the results of the M analyses using software packages. SAS-callable SUDAAN is a software package for analyzing data from complex surveys, which includes a built-in option for analyzing multiply imputed data (Research Triangle Institute, 2012). IVEware is a free SAS-callable software package, which has different modules for performing various multiple-imputation analyses incorporating complex sample designs. IVEware can be downloaded from the Web site https://www.src.isr.umich.edu/software/. SAS users can also use SAS proc procedures to conduct statistical analysis on each imputed data and then use SAS proc MIanalyze procedure to combine results of analyses of multiply imputed data (SAS, 2016). Stata procedures (StataCorp LP 2009) for performing multiple-imputation analyses and the mice (Multiple Imputation with Chained Equations) package in R (van Buuren S, Groothuis-Oudshoorn K, 2011) are also widely used to perform multiple imputation and the subsequent analysis.”
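To give a sense of what the combining step does, here is a minimal sketch of the standard combining formulas for multiply imputed data (Rubin's rules), which is the kind of calculation the packages named above carry out. It is not taken from the NCHS documentation. The function name and the example numbers are made up, and the sketch assumes you have already obtained one estimate and its variance from each of the five completed datasets, using a method that accounts for the NHIS complex sample design.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Combine M per-imputation estimates and their variances.

    Returns (pooled_estimate, total_variance, standard_error).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching lists with at least two imputations")
    pooled = mean(estimates)      # average of the M point estimates
    within = mean(variances)      # average within-imputation variance
    between = variance(estimates) # sample variance across imputations (divisor M - 1)
    total = within + (1 + 1 / m) * between
    return pooled, total, total ** 0.5


# Made-up example: five estimates of a mean, each with variance 4.0.
est = [10.0, 12.0, 11.0, 13.0, 14.0]
var = [4.0, 4.0, 4.0, 4.0, 4.0]
pooled, total_var, se = pool_rubin(est, var)
```

The between-imputation term is what makes the pooled standard error larger than the one you would get from analyzing a single imputation alone, which is why all five should be used.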