I’m trying to analyze nine years of NHIS data (2008-16). I downloaded, merged, and appended these data from the NHIS website. I also downloaded NHIS-IPUMS data for the same years. I am using both data sets to compare results (I was trying to create an exercise for my students). The unweighted frequencies match (see the output for sex below). However, when I ran the weighted analysis in Stata, the results were different. The NHIS data gives me 300 strata and 600 PSUs, whereas NHIS-IPUMS gives me 352 strata and 1,255 PSUs. The number of observations, population size, and design df are all different. I used the WTFA weight in the NHIS analysis and PERWEIGHT in the NHIS-IPUMS analysis. Also, the obs option gives me different frequency distributions: 579,934 observations in the NHIS run versus 654,109 in the NHIS-IPUMS run. Does anyone understand what I’m doing wrong here?
NHIS
. tab sex
HHC.110_00. |
000: Sex | Freq. Percent Cum.
------------+-----------------------------------
1 Male | 309,782 47.36 47.36
2 Female | 344,327 52.64 100.00
------------+-----------------------------------
Total | 654,109 100.00
. svy linearized : tabulate sex, obs percent format(%9.3g) miss
(running tabulate on estimation sample)
Number of strata = 300 Number of obs = 579,934
Number of PSUs = 600 Population size = 1,867,950,988
Design df = 300
HHC.110_0 |
0.000: |
Sex | percentage obs
----------+-----------------------
1 Male | 48.3 274487
2 Female | 51.7 305447
|
Total | 100 579934
Key: percentage = cell percentage
obs = number of observations
NHIS IPUMS
. tab sex, miss
Sex | Freq. Percent Cum.
------------+-----------------------------------
- 1.Male | 309,782 47.36 47.36
- 2.Female | 344,327 52.64 100.00
------------+-----------------------------------
Total | 654,109 100.00
. svy linearized : tabulate sex, obs percent format(%9.3g)
(running tabulate on estimation sample)
Number of strata = 352 Number of obs = 654,109
Number of PSUs = 1,255 Population size = 2,113,089,258
Design df = 903
Sex | percentage obs
----------+-----------------------
- 1.Mal | 48.3 309782
- 2.Fem | 51.7 344327
|
Total | 100 654109
Key: percentage = cell percentage
obs = number of observations
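For reference, here is a minimal sketch of how the two designs might be declared and checked. The NHIS run keeps 579,934 of the 654,109 records, so 74,175 observations fall out of the estimation sample. The variable names below (`psu`, `strata`, `perweight` for the IPUMS extract; `strat_p`, `psu_p`, `wtfa`, `srvy_yr` for the appended NHIS files) are assumptions and may not match the names in the actual files.

```stata
* Sketch only: all variable names are assumptions, not taken from the output above.

* --- With the IPUMS NHIS extract in memory ---
* Declare the survey design using the IPUMS design variables.
svyset psu [pweight = perweight], strata(strata)
* Summarize strata and PSU counts for the declared design.
svydescribe

* --- With the appended NHIS public-use files in memory ---
* Count records where any design variable is missing; svy commands
* exclude such records from the estimation sample.
count if missing(strat_p, psu_p, wtfa)
* See which survey years those records come from.
tabulate srvy_yr if missing(strat_p, psu_p, wtfa)
```

If the count from the second part comes out at 74,175, missing design variables would account for the gap in the number of observations. That is only one possible explanation.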