I'm trying to analyze nine years of NHIS data (2008-16). I downloaded these data from the NHIS website, then merged and appended them. I also downloaded the NHIS IPUMS data for the same years. I am using both data sets to compare results (I was trying to create an exercise for my students). The unweighted frequencies match (see the output for sex below). However, when I ran the weighted analysis in Stata, the results were different. The NHIS data give me 300 strata and 600 PSUs, whereas NHIS IPUMS gives me 352 strata and 1,255 PSUs. The number of observations, the population size, and the design df are all different. I used the WTFA weight in the NHIS analysis and PERWEIGHT in the NHIS IPUMS analysis. Adding the obs option also gives me different observation counts: 579,934 in NHIS versus 654,109 in NHIS IPUMS. Can anyone see what I am doing wrong here?
NHIS
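. tab sex

HHC.110_00. |
   000: Sex |      Freq.     Percent        Cum.
------------+-----------------------------------
     1 Male |    309,782       47.36       47.36
   2 Female |    344,327       52.64      100.00
------------+-----------------------------------
      Total |    654,109      100.00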
. svy linearized : tabulate sex, obs percent format(%9.3g) miss
(running tabulate on estimation sample)

Number of strata =   300        Number of obs   =       579,934
Number of PSUs   =   600        Population size = 1,867,950,988
                                Design df       =           300

HHC.110_0 |
0.000:    |
Sex       | percentage         obs
----------+-----------------------
   1 Male |       48.3      274487
 2 Female |       51.7      305447
          |
    Total |        100      579934

Key: percentage = cell percentage
     obs        = number of observations
NHIS IPUMS
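. tab sex, miss

        Sex |      Freq.     Percent        Cum.
------------+-----------------------------------
     1.Male |    309,782       47.36       47.36
   2.Female |    344,327       52.64      100.00
------------+-----------------------------------
      Total |    654,109      100.00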
. svy linearized : tabulate sex, obs percent format(%9.3g)
(running tabulate on estimation sample)

Number of strata =   352        Number of obs   =       654,109
Number of PSUs   = 1,255        Population size = 2,113,089,258
                                Design df       =           903

      Sex | percentage         obs
----------+-----------------------
    1.Mal |       48.3      309782
    2.Fem |       51.7      344327
          |
    Total |        100      654109

Key: percentage = cell percentage
     obs        = number of observations
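
One check that might narrow this down: the NHIS run uses 579,934 of the 654,109 records, so 74,175 records are left out of the estimation sample. As far as I understand, svy drops any record with a missing weight, stratum, or PSU value, so a rough sketch like the one below could show where those records go. The names wtfa, strata_var, psu_var, and srvy_yr are placeholders I am assuming here, not necessarily the names in the appended file; substitute whatever was used in svyset. The appended NHIS file needs to be in memory and already svyset.

```stata
* Sketch only: wtfa, strata_var, psu_var and srvy_yr are assumed placeholder
* names; replace them with the variables used in your own svyset command.

* 1. How many records have a missing weight or design variable?
count if missing(wtfa, strata_var, psu_var)

* 2. Is the missingness concentrated in particular survey years?
tabulate srvy_yr if missing(wtfa, strata_var, psu_var)

* 3. Rerun the survey tabulation, then count records outside the
*    estimation sample (these are the ones svy excluded).
svy linearized: tabulate sex, obs percent
count if !e(sample)
```

If the count in step 3 comes to 74,175 and step 2 shows the missing values sitting in only some years, that would point to the design variables not lining up across the appended yearly files rather than to a problem with the weights themselves.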