Rep weights and full samples


I understand you are supposed to have the full sample downloaded to use rep weights. If I’m using household rep weights (ACS 2012-16), do I need to keep all person-level observations? Does the full sample include group quarters?

Secondly, is it impossible to test models on a random subsample and still use survey weights properly? I’m having issues with the amount of time Stata is taking to run these models and didn’t know if there was a way to speed up the process.

Thank you!


An accessible explanation of what’s doable and not doable and likely erroneously doable with subsamples is

If you rely on replicate weights, I would argue you don’t have to use the full sample.

The full sample does include the group quarters observations; you can use the RELATE or GQTYPE variable to identify them.

As a preliminary step, I would fit my models with the main weight only (perwt, if I recall the variable name correctly off the top of my head) and expect that my standard errors could be off by a factor of at most 2. Then, once I have models that I am satisfied with, I would move on to the full analysis with the replicate weights. To make it easier in Stata, you can keep both settings in your do-file:

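* fake settings: uncomment for prelim analysis
* svyset [pw=perwt]

* real settings: uncomment for the ultimate analysis
* (IPUMS extracts usually name the person replicate weights repwtp1-repwtp80; check your extract)
* svyset [pw=perwt], vce(sdr) sdrweight(repwtp1-repwtp80) mse

* replace "model whatever" with your actual estimation command
svy : model whatever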

Then when the time comes, you can run it with the real thing.
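
If commenting and uncommenting lines gets tedious, one option is to switch between the two settings with a local macro. This is only a rough sketch, not the one right way to do it: the flag name prelim, the outcome y, and the predictors x1 and x2 are placeholders I made up, and I am assuming the person replicate weights in your extract are named repwtp1-repwtp80.

```stata
* Hypothetical flag: 1 = quick preliminary pass, 0 = final run with replicate weights
local prelim 1

if `prelim' {
    * main weight only: fast, but standard errors are only approximate
    svyset [pw=perwt]
}
else {
    * successive difference replication with the 80 replicate weights
    * (variable names assumed; check them against your extract)
    svyset [pw=perwt], vce(sdr) sdrweight(repwtp1-repwtp80) mse
}

* y, x1, x2 are placeholder variable names
svy: regress y x1 x2
```

Run the whole block in one go, since a local macro disappears once the do-file (or the selected lines) finishes running. When the models look right, set prelim to 0 and rerun.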