I’m running a difference-in-differences regression using Mexico’s 2010 census data. I want to compare the earnings of individuals in municipalities that received a program with the earnings of those in municipalities that didn’t. I am trying to understand how to use the census sample weights and which strata are relevant.
Am I correct in thinking that I need to specify:
svyset serial [pweight=wthh]
before running the regression (e.g., svy: reg x y)?
Any references on this would be greatly appreciated.
The Mexico 2010 sample requires weighting; it is stratified by municipality and clustered by household. As a result, researchers should adjust standard errors for the household-level clustering and use the weights to account for the complex survey design. Your Stata code appears consistent with this approach. The weights correct the point estimates for unequal selection probabilities; to reflect the stratification in the variance estimates as well, you might consider explicitly identifying municipality as the strata in your svyset command.
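As a rough sketch of what declaring the strata explicitly could look like, the lines below extend your svyset command. The variable names geo2_mx (municipality), incearn (earnings), treated, and post are placeholders assumed for illustration, not confirmed names from your extract; substitute whatever your data actually contain.

```stata
* Sketch only: geo2_mx, incearn, treated, and post are assumed variable names.
* Households (serial) are the clusters, wthh is the probability weight,
* and municipality is declared as the stratum.
svyset serial [pweight=wthh], strata(geo2_mx)

* Confirm the design settings Stata has stored.
svydescribe

* Difference-in-differences: the coefficient on 1.treated#1.post
* is the DiD estimate, with design-based standard errors.
svy: regress incearn i.treated##i.post
```

If some municipalities contain only one sampled household, Stata will report missing standard errors; the singleunit() option of svyset controls how such strata are handled.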
For a more complete discussion of sampling error and variance estimation, please refer to this IPUMS-International User’s Note.