IDHSSTRATA & IDHSPSU: Inconsistent Lengths

(Sorry: cross-posted on DHS User Forum)


Good evening - I am appending / pooling recent surveys from a number of African countries. I realized that the lengths of the key stratification variables, IDHSSTRATA and IDHSPSU, differ between countries.
According to IPUMS documentation, IDHSSTRATA is supposed to be a 9-character variable, while IDHSPSU is a 10-character variable.

But as the following example shows, the character length differs among countries. Does the length have to be the same across countries? If so, how do I add or subtract characters?

Angola 2015:
IDHSPSU = 2401000001
IDHSSTRATA = 240100018

Lesotho 2014:
IDHSPSU = 42603000180
IDHSSTRATA = 4260300016

Zimbabwe 2015:
IDHSPSU = 71606000390
IDHSSTRATA = 7160600017

thanks - Yawo

It looks like you are referring to the IPUMS DHS “harmonized” variables. In our system, all values of the IDHSSTRATA and IDHSPSU variables have the same number of characters no matter which country the data come from. To make the lengths equal across countries, shorter values are padded with leading zeros. Depending on which statistical software you are using, these leading zeros might be dropped when the variables are read as numbers, which would explain why the Angola values in your example are one character shorter than the others. You can “fix” this detail by editing the format of these variables in your data so that the leading zeros are displayed. Whether or not this matters depends on how you aim to use these variables. In most instances, including the leading zeros or not will make no difference, because the numeric values still identify each stratum and PSU uniquely.
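If you do want the leading zeros back, here is a minimal sketch in Python using the values from the question. The helper name pad_ids is made up for illustration, and the target width is an assumption: it is taken from the longest value found in the pooled data, so please check it against the IPUMS DHS variable description before relying on it.

```python
def pad_ids(values, width=None):
    """Left-pad numeric identifiers with zeros so they share one width.

    pad_ids is a hypothetical helper, not part of any IPUMS tool.
    If width is not given, the longest value in the input sets it
    (an assumption; confirm the intended width in the documentation).
    """
    texts = [str(int(v)) for v in values]
    if width is None:
        width = max(len(t) for t in texts)
    return [t.zfill(width) for t in texts]


# Values from the question: Angola 2015, Lesotho 2014, Zimbabwe 2015
idhspsu = [2401000001, 42603000180, 71606000390]
idhsstrata = [240100018, 4260300016, 7160600017]

padded_psu = pad_ids(idhspsu)
padded_strata = pad_ids(idhsstrata)

print(padded_psu)     # ['02401000001', '42603000180', '71606000390']
print(padded_strata)  # ['0240100018', '4260300016', '7160600017']
```

The padded values are strings, so they keep their zeros when saved or merged. Padding does not change which records belong to the same stratum or PSU, so it is only needed if you want the identifiers to look identical in width across countries.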