Perwt variable question in individual-level analysis


I am using several datasets including 2000 Census 5% and ACS 2001-2021 and running some individual-level analyses.

I am wondering how to include the “perwt” variable in the analyses.

Should I use [fw=perwt] or [pw=perwt] or [aw=perwt]?

“perwt” indicates how many persons in the U.S. population are represented by a given person in an IPUMS sample.

I tried all of them, but the standard errors from [fw=perwt] are very different from those from [pw=perwt] or [aw=perwt]: the SEs are much smaller with [fw=perwt].

Could you please help me understand which one I should use here and why?

Kind regards.

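The proper way to treat this is as a probability weight (pweight). PERWT is a sampling weight, and it is inversely proportional to the probability that a given individual was selected for the ACS sample. The SEs are much smaller with [fw=perwt] because a frequency weight tells Stata that each record stands for PERWT identical observations, so the sample size is treated as the sum of the weights rather than the number of people actually sampled. Typing -help weight- in Stata will give an overview of what each of Stata's weight types means and when it is appropriate.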
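To make the difference concrete, here is a minimal sketch on simulated data (not an IPUMS extract). The variables x and y and the simulated perwt values are made up purely for illustration. It runs the same regression with fweight and with pweight so the reported N and standard errors can be compared side by side.

```stata
* Toy comparison on simulated data (NOT IPUMS data); all names are illustrative.
clear
set seed 12345
set obs 500
generate x = rnormal()
generate y = 1 + 0.5*x + rnormal()
* Integer weights between 20 and 99, standing in for perwt.
generate perwt = 20 + floor(80*runiform())

* fweight: each row is treated as perwt identical observations.
regress y x [fweight=perwt]
scalar b_fw  = _b[x]
scalar se_fw = _se[x]
scalar n_fw  = e(N)

* pweight: 500 sampled rows, weighted, with robust standard errors.
regress y x [pweight=perwt]
scalar b_pw  = _b[x]
scalar se_pw = _se[x]
scalar n_pw  = e(N)

display "N:  fweight = " n_fw  "   pweight = " n_pw
display "b:  fweight = " b_fw  "   pweight = " b_pw
display "SE: fweight = " se_fw "   pweight = " se_pw
```

The coefficient on x should be the same under both weight types. What changes is the number of observations Stata believes it has, and with it the standard error.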

PERWT is the proper weight to use in making point estimates (for person-level analyses), but to get correct standard error estimates for the ACS you should use replicate weights (IPUMS variable REPWTP), which fully account for the complex sampling design of the ACS. The documentation for REPWTP describes the weights in detail and links to this page, which includes example code for how to use the replicate weights in Stata.


Thanks a lot for your help.

Do you mean that using PERWT alone might not give the correct SEs in individual-level analyses?

If so, I would not get the correct p-values or t-statistics there.

Would I then need to use REPWTP and run the regressions with [pw=REPWTP] to get the correct SEs?

Thanks again for your help.

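Yes, that is correct: using [pweight=PERWT] alone will not give correct standard error estimates. One caveat on the syntax: the replicate weights are not supplied as a single [pw=REPWTP]. REPWTP is a flag variable, and the 80 replicate weights themselves (REPWTP1-REPWTP80) are declared with svyset, as in the example code on the page linked above. Generally the SEs are somewhat larger when replicate weights are used, but this is not always the case.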
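As a rough sketch of the replicate-weight setup discussed above, the following assumes an IPUMS ACS extract is already loaded in memory, that it includes the person-level replicate weights under the lowercase names repwtp1-repwtp80, and that your Stata version supports vce(sdr). The variables incwage and age are only stand-ins for whatever outcome and covariates you are analyzing. Check the REPWTP documentation for which samples carry replicate weights before relying on this.

```stata
* Assumes an IPUMS ACS extract is in memory with perwt and repwtp1-repwtp80.
* incwage and age are placeholder analysis variables, not a recommendation.

* Declare the design once: perwt gives the point estimates,
* the 80 replicate weights give the variance (successive difference replication).
svyset [pweight=perwt], vce(sdr) sdrweight(repwtp1-repwtp80) mse

* Prefix estimation commands with svy: so they use the declared design.
svy: mean incwage
svy: regress incwage age
```

Once the design is declared with svyset, no [pw=...] option goes on the estimation command itself; the svy: prefix picks up both PERWT and the replicate weights.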