See the image below. Many of the PSU-level variables in IPUMS appear to be missing for NFHS 3, even though the variables used to create these estimates exist for this round.
For instance, IMPWATERPSU is missing for round 3 even though DRINKWTR, PSU and HHWEIGHT are available for NFHS 3. I want to understand whether there is a reason for systematically excluding these variables for this round.
“IMPWATERPSU reports the percentage of households that used an improved source for drinking water, based on the variable [DRINKWTR], in the primary sampling unit. This variable has been calculated by IPUMS using Chapter 16 of the [DHS Guide to Statistics 8] and weighted using the variable [HHWEIGHT].”
The India Standard DHS 2005-06 (NFHS-3) contained an HIV testing component that necessitated scrambling of PSU numbers in order to preserve respondent confidentiality. As a result, IPUMS DHS did not create PSU-level measures. From reviewing the documentation, it is not clear to me whether this scrambling retained the same households within each PSU or whether households were scrambled across PSUs. I recommend reaching out to the DHS Program to confirm the procedure used. If households were kept together within each PSU, you may merge an IPUMS DHS extract with a data extract from the DHS Program to add PSU identifiers to your observations and calculate these variables.
Would you be able to share the URL of where you obtained the screenshot from so that I can take a closer look? ROOMPPSU is not a variable that’s available on IPUMS DHS. A similar variable on IPUMS DHS, ROOMPPPSU, is not available for the India 2005 sample.
Thanks for the response, Ivan. I will check this with the DHS team. That is a screenshot from a document I created; the variable I was referring to was ROOMPPPSU. Thanks!
Can you please send me the DHS documentation that you mention above?
The documentation I referenced is the NFHS-3 Supplemental Documentation that was included with my download of the survey dataset from the DHS Program. You can download the survey data with this document from the summary page on the DHS Program website.
The IPUMS DHS team looked into the issue and determined that we will be integrating the scrambled PSU codes into IPUMS DHS and releasing the PSU-level variables for this sample. It may take some time to update these variables on the website. To get the data sooner, you can download the PSU identifiers directly from the DHS Program and merge them with an IPUMS DHS extract that contains the variables you are interested in. Alternatively, you can use CLUSTERNOALL (available for the India 2005 sample on IPUMS DHS), which reports the cluster number for household members. In both cases, you will need to calculate the PSU- or cluster-level weighted averages yourself. To do so, restrict your sample to one record per household (rather than counting a record for each household member) and calculate a weighted average using HHWEIGHT. Then multiply the weighted estimate by 100 to convert the fraction to a percentage.
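For anyone who wants to try this by hand, here is a minimal sketch of the calculation in plain Python. It is only an illustration under assumptions, not the procedure IPUMS uses internally: the field names (`cluster`, `hhid`, `hhweight`, `improved`) are hypothetical stand-ins for CLUSTERNOALL, a household identifier, HHWEIGHT, and a 0/1 flag that you would have to recode yourself from DRINKWTR, and the sample records are made up.

```python
from collections import defaultdict


def psu_weighted_percent(person_records, psu_key, hh_key, weight_key, flag_key):
    """Weighted percentage of households with flag == 1, per PSU or cluster.

    Keeps only the first record seen for each household, so households with
    many members are not counted more than once.
    """
    seen_households = set()
    weighted_hits = defaultdict(float)   # sum of weight * flag per PSU
    weight_totals = defaultdict(float)   # sum of weight per PSU

    for rec in person_records:
        household = (rec[psu_key], rec[hh_key])
        if household in seen_households:
            continue  # already counted this household
        seen_households.add(household)
        weight = rec[weight_key]
        weighted_hits[rec[psu_key]] += weight * rec[flag_key]
        weight_totals[rec[psu_key]] += weight

    # Multiply the weighted fraction by 100 to get a percentage.
    return {
        psu: 100.0 * weighted_hits[psu] / weight_totals[psu]
        for psu in weight_totals
        if weight_totals[psu] > 0
    }


# Made-up person-level records (hypothetical field names and values).
records = [
    {"cluster": 1, "hhid": "A", "hhweight": 1.0, "improved": 1},
    {"cluster": 1, "hhid": "A", "hhweight": 1.0, "improved": 1},  # second member of A
    {"cluster": 1, "hhid": "B", "hhweight": 3.0, "improved": 0},
    {"cluster": 2, "hhid": "C", "hhweight": 2.0, "improved": 1},
]

result = psu_weighted_percent(records, "cluster", "hhid", "hhweight", "improved")
print(result)  # {1: 25.0, 2: 100.0}
```

In cluster 1, household A (weight 1.0) has an improved source and household B (weight 3.0) does not, so the weighted share is 1.0 / 4.0, or 25 percent. The second record for household A is skipped, which is the "one record per household" step described above.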
Thank you for letting us know about this! We’re happy to be able to add new contextual variable data to help researchers interested in this sample in the future.
Thank you for the update!