Sampling Design Mexico 2010


I am using Mexico 2010 data and doing an individual-level analysis.
I have read all the available documentation on the Mexico 2010 survey design, and I still have a few questions. I am analyzing the data in Stata, so I need to declare the survey design.

  1. The documentation describes a one-stage stratified cluster sample, stratified by municipality. Enumeration areas (blocks of dwellings within a locality) were selected by simple random sampling within strata. The sampling unit is specified as dwellings (serial).

Stata needs a few things specified before the survey design can be declared:

a) Number of stages
b) Primary Sampling Unit (PSU)
c) Strata
d) Sampling weight

My understanding is:
a) Number of stages: 1
b) Primary Sampling Unit: household (serial)
c) Strata: municipality (geolev2)
d) Sampling weight: household weight (hhwt)

If this is correct, then for individual-level regressions I am also using [pweight=perwt], the person weight.

  2. If the setup above is correct, ignore this question.
    Is this a two-stage sample, given that clustering occurred at both the geographic and the household level?

  3. Can the 1.3% undercount from the post-enumeration survey be taken into account somehow, specifically in Stata?

This is a one-stage stratified sample survey, stratified by municipality and clustered by enumeration area (link for this sample’s design info). The PSU is the enumeration area, which is currently available in IPUMS-I only as a source variable (MX2010A_PMU and MX2010A_PMUP, which are identical). The strata variable is not currently available on IPUMS-I, but please get in touch for more assistance with this.

The weight is PERWT for person-level analysis, and HHWT for household-level analysis. There is no additional clustering at the household level, because all households within the enumeration area were interviewed.
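
As a rough sketch of how the design described above might be declared in Stata: the PSU and weight names below follow the variables mentioned in this reply (written in lowercase on the assumption that the extract uses lowercase names), while strata_var, y, x1 and x2 are hypothetical placeholders, since the strata variable is not currently on IPUMS-I.

```stata
* Sketch only: one-stage stratified cluster design, person-level analysis.
* mx2010a_pmu : enumeration-area PSU (IPUMS-I source variable; lowercase name assumed)
* perwt       : person weight
* strata_var  : HYPOTHETICAL placeholder for a strata variable obtained separately
svyset mx2010a_pmu [pweight = perwt], strata(strata_var)

* If no strata variable is available, clustering and weights can still be declared:
* svyset mx2010a_pmu [pweight = perwt]

* Individual-level regression under the declared design
* (y, x1, x2 are hypothetical variable names)
svy: regress y x1 x2
```

With the svy: prefix, the weight comes from svyset, so [pweight=perwt] is not repeated on the regression command itself.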

I’m not an expert on complex survey statistics, but generally the undercount will already be incorporated into the weights.

Thanks Matt!