Strange values from srvyr package

Evan_Berkowitch · July 27, 2021, 12:19am

I recently loaded ACS data into the srvyr package (R) to account for the design variables. However, when I tried totaling the population of the NYC metro area, the number was only ~18,000.

Here’s the code:

# Used "Does anyone have sample code for using svydesign function in R?" - #2 by gfellis

# Loads the dataframe into a data structure that takes into account other survey variables

# Metro 35620 is NYC. Use it as an example metro

svy <- data %>%
  as_survey(ids = CLUSTER, probs = PERWT, strata = STRATA, nest = TRUE) %>%
  filter(MET2013 == 35620, SAMPLE == 201901)

# Attempt to find sample total (number way too small)

svy %>% summarise(survey_total())

Is there something wrong with my syntax, am I misinterpreting the results, or is there some other issue? Thanks!

Grace_Cooper · August 2, 2021, 7:20pm

Your code is close; I believe the issue is that you used the ‘probs’ argument instead of the ‘weights’ argument to assign weights within the as_survey() function. PERWT is a person weight (the number of people each record represents), not a sampling probability, so passing it as ‘probs’ inverts it. For simplicity's sake, I find it useful to first create the survey object using the as_survey_design() function in srvyr:

svy_object <- data %>%
  as_survey_design(
    ids = CLUSTER,
    weights = PERWT,
    strata = STRATA,
    nest = TRUE
  )

Then, I run summary statistics on the survey object using dplyr-style verbs:

svy_object %>%
  filter(MET2013 == 35620, SAMPLE == 201901) %>%
  summarize(survey_total())

When I run the above code, I get a coefficient (the estimated total) of 19,839,535, which is consistent with the expected population of roughly 19 million for the New York City metro area. The srvyr vignette comparing srvyr to survey provides example code for calculating population estimates that you might find useful. In addition, keep in mind that you can pull up the R documentation within RStudio for most packages and functions by typing “?” followed by the package or function name in the console.
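To make the difference between ‘probs’ and ‘weights’ concrete, here is a minimal sketch on a made-up data frame. The toy values are hypothetical (only the column names CLUSTER, STRATA, and PERWT come from the thread), and it assumes the srvyr and dplyr packages are installed. With ‘weights’, each row counts for PERWT people; with ‘probs’, each row counts for 1/PERWT of a person, which is why the total collapses.

```r
library(dplyr)
library(srvyr)

# Hypothetical toy data: 4 sampled persons, each representing 100 people
toy <- tibble(
  CLUSTER = c(1, 2, 3, 4),
  STRATA  = c(1, 1, 2, 2),
  PERWT   = c(100, 100, 100, 100)
)

# PERWT passed as a weight: the total is the sum of the weights (4 * 100)
as_weights <- toy %>%
  as_survey_design(ids = CLUSTER, weights = PERWT, strata = STRATA, nest = TRUE) %>%
  summarise(total = survey_total())

# PERWT passed as a sampling probability: each row is weighted by 1 / PERWT,
# so the total is the sum of the inverse weights (4 * 1/100)
as_probs <- toy %>%
  as_survey_design(ids = CLUSTER, probs = PERWT, strata = STRATA, nest = TRUE) %>%
  summarise(total = survey_total())

as_weights$total
as_probs$total
```

On real ACS data the same inversion would be expected to shrink a metro total from millions to a number far below the sample size.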

Evan_Berkowitch · August 2, 2021, 7:33pm

Thank you for the reply! I’m currently trying to group the survey design by year and calculate sums among a subgroup (~76,000 observations).

When I ran the code, I waited for over ten hours and it never returned results. It works on a smaller subset, however.

Here’s the code:

# svy is the survey design of the dataset

priority_to_origin <- svy %>%
  filter(MET2013 %in% Priority$Code, MIGMET131 %in% Origin$val)

# Looking to show number of samples by year in this subset.

priority_to_origin %>%
  group_by(YEAR) %>%
  summarise(Total = survey_total())

Grace_Cooper · August 16, 2021, 5:53pm

Running an analysis on a sample of that size should not be an issue; the srvyr package should be able to handle creating survey objects for multiple ACS samples. This is likely an issue with your RStudio installation rather than with IPUMS data or the srvyr package. You may want to try reinstalling RStudio to see if that fixes the issue.
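For reference, a grouped total of this kind can be sketched on a small made-up data frame. The values below are hypothetical (only the column names come from the thread), and the sketch assumes srvyr and dplyr are installed. It also assumes that "number of samples by year" may mean the raw record count, so it reports both the weighted total from survey_total() and an unweighted count via srvyr's unweighted(n()). It does not by itself address the long run time described above.

```r
library(dplyr)
library(srvyr)

# Hypothetical toy data: two years, two strata, eight clusters
toy <- tibble(
  YEAR    = c(2018, 2018, 2018, 2018, 2019, 2019, 2019, 2019),
  STRATA  = c(1, 1, 2, 2, 1, 1, 2, 2),
  CLUSTER = 1:8,
  PERWT   = c(100, 120, 80, 100, 90, 110, 100, 100)
)

by_year <- toy %>%
  as_survey_design(ids = CLUSTER, weights = PERWT, strata = STRATA, nest = TRUE) %>%
  group_by(YEAR) %>%
  summarise(
    Total = survey_total(),   # weighted population estimate per year
    n     = unweighted(n())   # raw number of sample records per year
  )

by_year
```

If only the record counts are needed, the unweighted column does not depend on the design variables, which may make it a useful sanity check before running the weighted estimate on the full data.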
