Complex logistic regression

I am using the 2013-2018 National Health Interview Survey data to run a logistic regression analysis. My DV is cancer and my IV is sleep. I have several covariates, such as age, sex, education level, and income. I think I need to run a complex samples logistic regression because the data need to be weighted, but I am not sure how to do it correctly. How can I create a complex design, identify the appropriate variables, and specify the sampling design in SPSS? I am receiving the error “This procedure ignores the weight variable.” How can I fix it?

I am not familiar with SPSS Complex Samples, but I wonder whether the IBM documentation (linked previously) or this general guidance from Wayne State might be helpful for you.

You should use STRATA as the strata variable and PSU as the cluster variable. Which weight variable to use depends on the specific variables you are including in your analysis. Many of the sleep variables are from the sample adult questionnaire, and analyses using these should be weighted with SAMPWEIGHT (this is true for variables from the cancer supplement in these years as well). I would specify the sampling as without replacement; page 24 of the NCHS sample design documentation for the most recent NHIS sample design indicates that year-specific samples are not independent. Note that for any analyses of NHIS data that include both 2019 and 2020, there are repeat observations (see IPUMS NHIS variable NHISPIDPRVYR for more information).
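To illustrate why the weight matters for the regression itself, here is a minimal sketch in Python (not SPSS syntax) that fits a logistic regression with and without a sampling weight. Everything in it is an assumption for illustration: the function name fit_weighted_logit is hypothetical, the eight records are made up, and the values in w only stand in for a weight such as SAMPWEIGHT. It covers the point estimates only. Design-based standard errors also need the strata and cluster information (STRATA and PSU), which is what a complex samples procedure adds.

```python
import math

def fit_weighted_logit(x, y, w, iterations=25):
    """Fit logit P(y=1) = b0 + b1*x by Newton-Raphson on the weighted log-likelihood.

    x, y, w are equal-length lists: one predictor, a 0/1 outcome, and a weight.
    Returns the point estimates (b0, b1) only, with no standard errors.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = 0.0          # gradient of the weighted log-likelihood
        h00 = h01 = h11 = 0.0  # negative Hessian (information matrix)
        for xi, yi, wi in zip(x, y, w):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            resid = wi * (yi - p)
            var = wi * p * (1.0 - p)
            g0 += resid
            g1 += resid * xi
            h00 += var
            h01 += var * xi
            h11 += var * xi * xi
        # Solve the 2x2 system (information matrix) * step = gradient
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Made-up toy records, NOT NHIS data: x = 1 for short sleep, y = 1 for cancer.
x = [0, 0, 0, 0, 1, 1, 1, 1]
y = [0, 0, 0, 1, 0, 1, 1, 0]
w = [1000, 1500, 1200, 800, 3000, 500, 700, 2500]  # stand-in for a sampling weight

_, b1_unweighted = fit_weighted_logit(x, y, [1.0] * len(x))
_, b1_weighted = fit_weighted_logit(x, y, w)

print("unweighted odds ratio:", math.exp(b1_unweighted))
print("weighted odds ratio:  ", math.exp(b1_weighted))
```

In this toy example the unweighted odds ratio is 3.0, while the weighted one is about 1.01, because the heavily weighted records pull the estimate toward the pattern they represent in the population. A procedure that ignores the weight variable would report something like the first number.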

I am also linking to a YouTube tutorial on running regression analyses with the Complex Samples functionality in SPSS (the example uses NHANES data, but the principles should be generally comparable).

Because I am not an SPSS user, I am afraid I don’t have much further general guidance. I hope these resources help.