DHS datasets - "Not in Universe" and Missing values



I’m a little confused by the “not in universe” labeling - in particular how it might differ from missing values. I used the IPUMS-DHS website to pull several West African countries’ women surveys over several years and I have several variables with a “not in universe” label along with missing values.

I’m assuming that “not in universe” simply means that the question didn’t apply to that woman. For instance, DELDOC_01 being equal to “not in universe” would mean that the woman didn’t give birth in the last three to five years, because if she had, she could have answered yes or no to whether some kind of medical professional delivered the baby. If that assumption is right, how do these women differ from the ones with missing values? Or am I completely off base?

Thank you in advance for your reply.


Yes, you are correct in assuming that “not in universe” means “the question does not apply to this person.” You will find more exact information if you consult the “Universe” tab of a variable description. For example, DELDOC_01 for Bangladesh 2011 has the Universe of:

Ever-married women age 12-49 who gave birth in the 5 years before the survey.

Cases coded as “Missing” are in the universe, so the question should have applied to them, but they have no response in the data. A woman who refused to answer a question would be coded as “Missing” for that variable; if an interviewer accidentally failed to ask a question of a woman who was in the universe for it, that woman would also be given a “Missing” response code.
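
In practice, this usually means keeping the two codes separate when you tabulate a variable: “not in universe” cases drop out of the denominator, while “Missing” cases are in-universe non-responses that you may want to report. Here is a rough sketch of that idea in Python. The numeric codes below (1 = yes, 8 = missing, 9 = not in universe) and the toy values are assumptions for illustration only; check the “Codes” tab of each variable for the actual values in your extract.

```python
from collections import Counter

# Assumed codes for illustration; confirm them on the variable's Codes tab.
YES_CODE = 1
MISSING_CODE = 8   # in universe, but no response recorded
NIU_CODE = 9       # question did not apply to this woman


def summarize(values):
    """Count NIU, missing, and answered cases, and the share answering yes."""
    counts = Counter(values)
    niu = counts[NIU_CODE]
    missing = counts[MISSING_CODE]
    answered = len(values) - niu - missing
    # Share of "yes" among women who were asked and gave a response.
    share_yes = counts[YES_CODE] / answered if answered else None
    return {
        "not_in_universe": niu,
        "missing": missing,
        "answered": answered,
        "share_yes": share_yes,
    }


# Toy DELDOC_01 values, not real survey data.
toy_deldoc = [0, 1, 1, 8, 9, 9, 0, 1]
summary = summarize(toy_deldoc)
print(summary)
```

Whether to keep the “Missing” cases in the denominator or drop them is an analytical choice; the sketch drops them, which is only one reasonable option.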

Thanks for your question. I hope this helps.