Question about svyset and IPUMS


I’m having some trouble with svyset and an IPUMS DHS merged sample. Is it correct to use svyset v021 [pw=perweight], strata(v022), or should I be using clusterno instead of v021 and strata instead of v022?


Per the variable description for STRATA (V022), you should use STRATA together with PSU. If your merged file pools several samples, you may prefer IDHSSTRATA, because the original STRATA values may not be unique across samples:

The DHS Program recommends using STRATA along with the variable PSU (V021) to account for the impact of the sample design clustering on the estimates of variance and standard errors.

When pooling samples, users are advised to use IDHSSTRATA, which provides a unique numeric value for each stratum across all samples. IDHSSTRATA concatenates the variables SAMPLE and STRATA.

Note that the IPUMS DHS variable STRATA is the same as V022; the only difference is the variable name. Because some researchers may be more familiar with the original DHS variable name, we provide both mnemonics in IPUMS DHS.
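
For reference, the advice above might translate into a svyset call along the following lines. This is only a sketch, not an official IPUMS recommendation for your specific extract: it assumes the extract includes PSU, IDHSSTRATA, and PERWEIGHT, and that they appear in Stata under the lowercase names psu, idhsstrata, and perweight.

```stata
* Sketch: declare the survey design for a pooled IPUMS DHS extract.
* Assumption: the variables psu, idhsstrata, and perweight exist
* in the extract under these lowercase names.
svyset psu [pweight=perweight], strata(idhsstrata)

* For a single (unpooled) sample, the original strata variable
* could be used instead:
* svyset psu [pweight=perweight], strata(strata)

* Summarize strata and PSU counts to check the declared design.
svydescribe
```

After svyset, estimation commands need the svy: prefix for the design settings to be applied.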