Regarding the survey command

Hello, I have a question about the survey command:

I want to get population estimates using the survey command. I am using the 1970-2023 files.

Following is my code:
svyset [pw=perwt]
gen a=0
replace a=1 if statefip==22 & gradeatt==2 & schltype==2
svy, subpop(a): tab a

In the output, I get the subpop. no. obs as 1,742

I then run the code: sum if statefip==22 & gradeatt==2 & schltype==2
The number of observations I get is 1,743

Should these two numbers not be equal as they represent the number of observations in the sample? Is this a data error?

I took a look at your extract and determined that the discrepancy comes from one observation in your subsample that has PERWT = 0. When you svyset the data using PERWT, observations with missing or zero weights are excluded from the subpopulation by default. The sum command is unweighted here, so it still counts that observation.

The presence of respondents with values of zero for PERWT in the 2000 1% file is intentional. It is a result of the Census Bureau’s weighting strategy that assigns lower weights to persons who are overrepresented in the sample; it seems plausible that additional rounding could result in weights of 0. You can read more about the weighting procedure in the 2000 1% PUMS technical documentation.
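
If you want to confirm this in your own extract, the following is a minimal sketch rather than a definitive check. It assumes the variable names from your code (statefip, gradeatt, schltype, perwt) and that your extract includes the standard IPUMS variable year; adjust the names if yours differ.

```stata
* Count subsample observations that carry a zero or missing person weight.
* If the explanation above holds, this should account for the one-case gap.
count if statefip==22 & gradeatt==2 & schltype==2 & (perwt==0 | missing(perwt))

* Count subsample observations with a positive, non-missing weight.
* Stata treats missing as larger than any number, so perwt>0 alone
* would also match missing weights; !missing() rules those out.
count if statefip==22 & gradeatt==2 & schltype==2 & perwt>0 & !missing(perwt)

* See which sample year(s) the zero-weight case(s) come from
* (assumes the extract contains the variable year).
tabulate year if statefip==22 & gradeatt==2 & schltype==2 & perwt==0
```

The second count should line up with the subpopulation number of observations that svy reports, and the first with the difference between that number and the unweighted sum result.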

Thank you for your clear response!