IPUMS-DHS commands for Stata

This is my first attempt at data analysis using DHS data from the IPUMS platform. I have downloaded 3 data sets with the selected variables and am now at the analysis stage. I have found the exercises and webinars posted on YouTube very helpful, but I have some questions about the commands shown there. My analysis looks for an association between consanguinity and infant deaths, using All Births as the unit of analysis.

  1. The webinar mentioned the following command for the descriptive analysis (IPUMS DHS Working with Geography Variables - YouTube… at 38:27):
    svyset idhspsu, strata (idhsstrata) weight(perweight) vce(linearized) singleunit(centered)

  2. But I have also found the following command in the exercises online:
    tab (var1) sample [aw=perweight] ,col

Q: The two approaches give different results. Which one is more appropriate for my descriptive analysis?
Q: Which svyset command and weights should I apply for my analysis?
Q: How can I carry out logistic regression with the IPUMS data obtained? I haven’t seen an example command for Stata online.

I would really appreciate some guidance on this.

I just found one of my questions answered in the forum. The svyset command is:
svyset [pw=perweight], psu (idhspsu) strata (idhsstrata)

I will await responses to the remaining questions.


I am glad you found clarification on how to apply weights using svyset. We will revisit the data training exercises to ensure that the weight suggestions there are correct and/or document why they differ from the svyset approach.
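
As a rough sketch of how the svyset declaration feeds into a descriptive table, the design can be declared once and the tabulation then run with the svy: prefix. The variable name consang is a hypothetical stand-in for the consanguinity variable in your extract; idhspsu, idhsstrata, perweight, and sample are the names used earlier in this thread.

```stata
* Declare the survey design once per session.
* Variable names follow the commands quoted earlier in this thread.
svyset idhspsu [pw=perweight], strata(idhsstrata) vce(linearized) singleunit(centered)

* Design-based two-way table with column percentages.
* consang is a hypothetical name; replace it with your own variable.
svy: tabulate consang sample, column percent
```

The weighted column percentages should match those from tab with [aw=perweight], since both use the same weights. What the svy: prefix changes is the standard errors, confidence intervals, and test statistic, which then account for the clustering and stratification of the DHS sample design.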

Logistic regression can be fit in Stata using the “logistic” command followed by the dependent variable and then the independent variables; this command reports odds ratios. Here is some example syntax from the Stata manual on logistic:

logistic depvar indepvars

The “logit” command uses the same syntax but reports coefficients (log odds) rather than odds ratios. In this example from the Stata manual on logit, y is the dependent variable and x1 and x2 are independent variables:

logit y x1 x2

For more discussion about completing logistic regression in Stata, see this forum post.
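
With survey data, these commands are usually run with the svy: prefix so that the weights, PSUs, and strata declared in svyset are taken into account. A minimal sketch, assuming svyset has already been run and using hypothetical variable names (infdeath for a 0/1 infant death indicator, consang for consanguinity):

```stata
* Assumes the survey design was already declared with svyset.
* infdeath and consang are hypothetical names; sample is the IPUMS sample identifier.

* Odds ratios, with consang and sample entered as categorical predictors
svy: logistic infdeath i.consang i.sample

* The same model reported as coefficients (log odds)
svy: logit infdeath i.consang i.sample

* The logit model again, with results displayed as odds ratios
svy: logit infdeath i.consang i.sample, or
```

Including i.sample is one way to adjust for differences between the pooled surveys; whether that is appropriate depends on your research question.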

Thank you Grace for all the help. I have found the answers I needed in the resources provided.

I am currently having trouble creating a variable for the father’s age at the time of the child’s birth. I was able to do this for the mother’s age, but the father’s date of birth is not available in CMC format, so I am not sure how to generate this variable.
The following variables seem helpful: HUSAGE, KIDDOBCMC, and INTDATECMC.
My analysis looks for risk factors linked to infant deaths in Pakistan using the 2006, 2012, and 2017 DHSs, and the unit of analysis I have chosen is ALL BIRTHS.
Can you guide me in this matter further?

Best regards,

Father’s approximate month of birth can be determined using variables that all report current status (at the time of the interview), which avoids the complication of combining data in “current” format with data in CMC (century month code) format. HUSAGE (Age of husband) reports the current age of the husband/partner of the woman (the individual surveyed in the BIRTHS unit of analysis); combined with the year (INTYEAR) and month (MONTHINT) of interview, it gives the month of birth of the father. Because HUSAGE is reported in completed years, the result is accurate only to within about a year. KIDDCURAGEMO can then be used in the same way (in combination with INTYEAR and MONTHINT) to determine the month of birth of the child. The difference between the two birth dates is the father’s age at the time of the child’s birth.
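
The calculation could be sketched in Stata roughly as follows. This is an illustration under several assumptions, not a definitive recipe: the variable names are assumed to appear in lowercase in your extract, the new variable names (int_months, father_birth_months, kid_birth_months, fatherage_birth, fatherage_birth_cmc) are made up for this example, and any missing or not-in-universe codes on HUSAGE and the other inputs need to be set to missing first (please check the codebook for the exact codes). It also assumes that the woman's current husband/partner is the child's father.

```stata
* Interview date expressed as a running count of months (a common scale, not CMC)
generate int_months = intyear*12 + monthint

* Approximate birth month of the father; husage is in completed years,
* so this can be off by up to about a year
generate father_birth_months = int_months - husage*12

* Birth month of the child from current age in months
generate kid_birth_months = int_months - kiddcuragemo

* Father's approximate age in years when the child was born
generate fatherage_birth = (kid_birth_months - father_birth_months)/12

* Alternative using the CMC variables mentioned in the question:
* years elapsed between the child's birth and the interview, subtracted from husage
generate fatherage_birth_cmc = husage - (intdatecmc - kiddobcmc)/12
```

If the child's current age in months turns out not to be reported for children who have died, which is worth checking given the focus on infant deaths, the CMC-based version may be the more usable of the two.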