Know of any tutorials for using the R survey package with ACS PUMS? Having problems weighting data.


Here’s the syntax of my query in R:

> <- svydesign(id=~serial, strata=~strata, data=workers1, weights=perwt)

And here’s the result:

Error in inherits(weights, "formula") : object 'perwt' not found

“Perwt” is a variable in the data frame workers1, so the error message doesn’t make sense to me. Serial and strata (a combo of statefip and puma) are also variables in the data frame. I’m trying to create a svydesign object, a more-or-less mandatory element in the survey package, so that I can run some statistical procedures on the data.



To answer your immediate question: in my data set, perwt is capitalized, and R is case-sensitive. “PERWT” works for me. You also need to put parentheses around the weights variable, because it’s not an object, it’s a column name.
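For reference, a minimal sketch of the call with the weight column given as a one-sided formula, which is how svydesign() takes column names (the same form already used for id and strata in the original call). The toy data frame and the incwage column are made up for illustration; swap in your own data and match the case of your column (perwt vs. PERWT).

```r
library(survey)

# Hypothetical toy stand-in for the workers1 data frame in the question
workers1 <- data.frame(
  serial  = c(1, 1, 2, 3, 4, 4, 5, 6),            # household id (cluster)
  strata  = c("a", "a", "a", "a", "b", "b", "b", "b"),
  perwt   = c(10, 12, 8, 15, 20, 18, 9, 11),      # person weight
  incwage = c(30000, 25000, 42000, 51000, 38000, 0, 27000, 60000)
)

# weights = ~perwt is a formula, so perwt is looked up inside `data`
design <- svydesign(ids = ~serial, strata = ~strata,
                    weights = ~perwt, data = workers1)

# Weighted mean with a design-based standard error
est <- svymean(~incwage, design)
print(est)
```

A bare perwt is evaluated as an R object in the calling environment, which is where the “object not found” error comes from; ~perwt is resolved against the columns of data instead.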

I haven’t found a good way to use weights in R, but I have been using data.table, which aggregates and summarizes weighted data quite nicely. It may be an option worth exploring. For my purposes, the survey package isn’t quite right.
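A rough sketch of the kind of weighted summary data.table makes easy, assuming a person weight column perwt and hypothetical statefip and incwage columns. It gives weighted point estimates only; it does not compute design-based standard errors.

```r
library(data.table)

# Hypothetical toy data; column names are assumptions for illustration
workers1 <- data.table(
  statefip = c(1, 1, 1, 2, 2, 2),
  perwt    = c(10, 20, 30, 5, 5, 10),
  incwage  = c(100, 200, 300, 400, 400, 200)
)

# Weighted population count and weighted mean wage per state
by_state <- workers1[, .(
  pop          = sum(perwt),
  mean_incwage = weighted.mean(incwage, perwt)
), by = statefip]

print(by_state)
```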




What do you mean that library(survey) isn’t quite right? If you are not using it, you are most likely getting your standard errors wrong. (Well, maybe you simply don’t care about standard errors, which would be a different story.)



Thanks, Dillon. I’m finding the survey package more than a little frustrating. I hadn’t realized that data.table was good for weighting data. I’ll look into that solution.



I meant quotation marks, not parentheses! Ack.

You’re right, it messes up the standard error. I’ve had to recalculate, again using data.table. The survey package has a fixed set of statistics that it calculates, and I find it very difficult to analyze data with it.