How can I use replicate weights to create standard errors in R?

bmccomas · October 2, 2018, 9:25pm

I am using R to analyze CPS data on household income and would like to use the replicate weights to create standard errors.

I am aware that such a code exists in STATA and other statistical software but am having issues translating this to R.

gfellis · October 5, 2018, 3:09pm

Post edited 12/23/2020 to correct a typo

Note that the CPS weighting system has changed a little bit this year, and not all of our documentation has been updated. There used to be just one variable name across all supplements (WTSUPP), but now the variable name depends on which supplement you are using. The examples below use ASEC data (and so use ASECWT), but see the chart here for the variable you should use:

https://cps.ipums.org/cps/weights_ren…

And here are some examples using first the survey package, and then the srvyr package (which is based on survey, but uses dplyr syntax).

library(ipumsr)
library(dplyr)

# Read data and some light data formatting
data <- read_ipums_micro("cps_00021.xml")
#> Use of data from IPUMS-CPS is subject to conditions including that users should
#> cite the data appropriately. Use command ipums_conditions() for more details.

data <- data %>%
  mutate(
    AGE = as.numeric(AGE),
    SEX = as_factor(SEX),
    INCTOT = as.numeric(lbl_na_if(INCTOT, ~.val >= 99999990))
  )

# R (survey package) -----
# If not installed already: install.packages("survey")
library(survey)

svy <- svrepdesign(data = data, weight = ~ASECWT, repweights = "REPWTP[0-9]+", type = "JK1", scale = 4/160, rscales = rep(1, 160), mse = TRUE)

# Calculate mean of INCTOT
svymean(~INCTOT, svy, na.rm = TRUE)
#>         mean     SE
#> INCTOT 42526 383.64

# Calculate a mean of INCTOT, on the subset of people aged 25-64
svy_subset <- subset(svy, AGE >= 25 & AGE < 65)
svymean(~INCTOT, svy_subset, na.rm = TRUE)
#>         mean     SE
#> INCTOT 51407 496.95

# Calculate the mean of INCTOT by SEX
svyby(~INCTOT, ~SEX, svy, svymean, na.rm = TRUE)
#>           SEX   INCTOT       se
#> Male     Male 53196.41 637.2199
#> Female Female 32456.95 325.3275

# R (srvyr package - uses dplyr-like syntax) -----
# If not installed already: install.packages("srvyr")
library(srvyr)

svy <- as_survey(data, weight = ASECWT, repweights = matches("REPWTP[0-9]+"), type = "JK1", scale = 4/160, rscales = rep(1, 160), mse = TRUE)

# Calculate mean of INCTOT
svy %>%
  summarize(mn = survey_mean(INCTOT, na.rm = TRUE))
#> # A tibble: 1 x 2
#>       mn mn_se
#>    <dbl> <dbl>
#> 1 42526.  384.

# Calculate a mean of INCTOT, on the subset of people aged 25-64
svy %>%
  filter(AGE >= 25 & AGE < 65) %>%
  summarize(mn = survey_mean(INCTOT, na.rm = TRUE))
#> # A tibble: 1 x 2
#>       mn mn_se
#>    <dbl> <dbl>
#> 1 51407.  497.

# Calculate the mean of INCTOT by SEX
svy %>%
  group_by(SEX) %>%
  summarize(mn = survey_mean(INCTOT, na.rm = TRUE))
#> # A tibble: 2 x 3
#>   SEX        mn mn_se
#>   <fct>   <dbl> <dbl>
#> 1 Male   53196.  637.
#> 2 Female 32457.  325.

Jamaal_Green · September 23, 2019, 5:16pm

I’m working with the ASEC file to estimate TANF participation. I found this thread on specifying the survey design in R, but on Anthony Damico’s site on complex survey design (http://asdfree.com/current-population-survey-basic-monthly-cpsbasic.html) the type and row parameters are different. Is this a mistake on Damico’s part? Should I follow this approach? Additionally, are there any reference tables to make sure estimates are correct? The ACS PUMS provide state-level estimates you can use to check that your survey design is correct. Is something similar available for CPS?

JeffBloem · September 24, 2019, 2:23pm

One key difference between the approach detailed on the linked webpage and the approach noted on the IPUMS Forum above is that the latter integrates the replicate weights provided with the CPS data while the former does not. These are two distinct ways of calculating standard errors. More information about replicate weights is available here. Regarding any previously calculated statistics using the CPS data, I’ll direct you to the BLS website. They list a number of tables with published statistics using the CPS data.

Jamaal_Green · September 24, 2019, 6:43pm

Excuse me. I meant to send the ASEC link that does include the replicate weights. I just want to make sure the approach linked in the above post is appropriate for calculating proper standard errors with the survey package? And I was curious if there are any reference files to double check the estimates like one can do with the ACS values?

JeffBloem · September 25, 2019, 3:54pm

Yes, the R packages survey and srvyr can help facilitate specification of sample design with the ipumsr package. Regarding any previously calculated statistics using the CPS data, I’ll direct you to the BLS website. They list a number of tables with published statistics using the CPS data.

Philippe_Lemoine · December 12, 2019, 4:07pm

This is very helpful, but in the call to svrepdesign/as_survey, shouldn’t the value of scale be 4/160 instead of 4/60? According to IPUMS CPS, the multiplier in front of the sum of squared deviations in the formula for the standard error is 4/160, not 4/60. Am I missing something?

Molly_Richard · December 21, 2020, 10:20pm

Do you have any similar sample code for using replicate weights within these packages for IPUMS USA (ACS files)?

Matthew_Bombyk · December 23, 2020, 7:21pm

You’re correct that there was a typo in the earlier post. I have edited that post to use the correct denominator of 160 in the survey design specification step. Thanks for pointing this out, and sorry for the year-late follow-up; your post must have slipped through the cracks!
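For anyone who wants to see what the 4/160 multiplier discussed above does, here is a minimal base R sketch of the calculation: one weighted mean per replicate weight, squared deviations from the full-sample estimate, summed and scaled by 4/160. The data are simulated stand-ins, not CPS values, and the object names (income, full_wt, rep_wts) are made up for illustration. This mirrors the type = "JK1", scale = 4/160, rscales = rep(1, 160), mse = TRUE settings only in outline; it is not a replacement for the survey package.

```r
# Simulated stand-in data: NOT real CPS values.
set.seed(1)
n <- 200        # hypothetical number of person records
n_reps <- 160   # number of ASEC replicate weights, per the design call above

income <- rlnorm(n, meanlog = 10, sdlog = 1)
full_wt <- runif(n, 500, 1500)  # plays the role of ASECWT

# n x 160 matrix; column r plays the role of REPWTP<r>
rep_wts <- sapply(seq_len(n_reps), function(r) full_wt * runif(n, 0.5, 1.5))

# Full-sample estimate and one estimate per replicate weight
theta_hat <- weighted.mean(income, full_wt)
theta_reps <- apply(rep_wts, 2, function(w) weighted.mean(income, w))

# SE = sqrt( (4 / 160) * sum_r (theta_r - theta_hat)^2 )
se_hat <- sqrt((4 / n_reps) * sum((theta_reps - theta_hat)^2))
se_hat
```

Because the deviations are taken around the full-sample estimate rather than around the mean of the replicate estimates, this corresponds to the mse = TRUE choice in the design specification.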

Matthew_Bombyk · December 23, 2020, 7:37pm

The code for ACS samples should be nearly the same. The IPUMS USA page on replicate weights gives details on the calculations: https://usa.ipums.org/usa/repwt.shtml

I’ll refer you to the earlier post in this thread by @gfellis giving example code for using replicate weights with ASEC data. Apart from specific variables used in analysis, the code will work more or less unchanged for ACS, with a few modifications to the survey design specification. Below I’ve highlighted the things you would need to change, depending on whether you’re using -survey- or -srvyr- packages:

Using -survey-

ASEC:

svy <- svrepdesign(data = data, weight = ~ASECWT, repweights = "REPWTP[0-9]+",
      type = "JK1", scale = 4/160, rscales = rep(1, 160), mse = TRUE)

ACS:

svy <- svrepdesign(data = data, weight = ~PERWT, repweights = "REPWTP[0-9]+",
      type = "JK1", scale = 4/80, rscales = rep(1, 80), mse = TRUE)

Using -srvyr-

ASEC:

svy <- as_survey(data, weight = ASECWT, repweights = matches("REPWTP[0-9]+"),
      type = "JK1", scale = 4/160, rscales = rep(1, 160), mse = TRUE)

ACS:

svy <- as_survey(data, weight = PERWT, repweights = matches("REPWTP[0-9]+"),
      type = "JK1", scale = 4/80, rscales = rep(1, 80), mse = TRUE)
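The only design arguments that differ between ASEC and ACS in the calls above are the ones tied to the number of replicate weights (160 versus 80). As a small sketch of that relationship, the helper below builds those arguments from the replicate count. The function name rep_design_args is hypothetical and is not part of survey or srvyr.

```r
# Hypothetical helper (not part of survey or srvyr): collects the design
# arguments that depend only on the number of replicate weights.
rep_design_args <- function(n_reps) {
  list(
    type = "JK1",
    scale = 4 / n_reps,        # 4/160 for ASEC, 4/80 for ACS
    rscales = rep(1, n_reps),  # one unit scale per replicate
    mse = TRUE
  )
}

asec_args <- rep_design_args(160)
acs_args <- rep_design_args(80)

asec_args$scale  # 0.025
acs_args$scale   # 0.05
```

The weight variable (ASECWT or PERWT) still has to be supplied separately, since it depends on the dataset rather than on the replicate count.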

Cela · October 21, 2021, 9:34pm

@Matthew_Bombyk (or other IPUMS Staff :D)

If I can revive an old thread… Can I confirm that I am repurposing the code correctly for weighting at the household level? For instance, if my goal is to aggregate data at the household level for CPS ASEC, I should be using:

-survey package-

svy <- svrepdesign(data = data, weight = ~ASECWTH, repweights = "REPWT[0-9]+",
      type = "JK1", scale = 4/160, rscales = rep(1, 160), mse = TRUE)

Of note is using REPWT over REPWTP, and using ASECWTH over ASECWT. My understanding is we don’t need to change the parameters in type, scale, or rscales?

Thanks so much!

Matthew_Bombyk · October 22, 2021, 1:53pm

That looks right to me.
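As a quick sanity check on the household versus person patterns, the sketch below applies both regular expressions to a made-up vector of column names. The names are assumptions meant to mimic an ASEC extract that contains both sets of replicate weights, and the check assumes the pattern is applied as a regular expression to the column names.

```r
# Hypothetical column names mimicking an extract with both weight sets
cols <- c("ASECWT", "ASECWTH", "REPWT", "REPWTP",
          paste0("REPWT", 1:160), paste0("REPWTP", 1:160))

hh_cols <- grep("REPWT[0-9]+", cols, value = TRUE)       # household replicates
person_cols <- grep("REPWTP[0-9]+", cols, value = TRUE)  # person replicates

length(hh_cols)                  # 160
length(person_cols)              # 160
any(grepl("^REPWTP", hh_cols))   # FALSE: no person weights picked up

# Stray whitespace inside the pattern prevents any match
length(grep(" REPWT[0-9]+", cols))  # 0
```

The household pattern does not pick up the person weights because it requires a digit immediately after "REPWT", and the person columns have a "P" in that position.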

Cela · October 22, 2021, 2:01pm

Awesome. Thanks @Matthew_Bombyk .

Jana_Sessler · January 4, 2022, 2:37pm

I tried to follow exactly the steps explained above, but always get this error:

Error in if (combined.weights & probably.not.combined.weights) warning(paste("Data do not look like combined weights: mean replication weight is", :
missing value where TRUE/FALSE needed

does anybody have an idea what’s wrong ?

Many thanks in advance !

Matthew_Bombyk · January 7, 2022, 11:10pm

This may be a problem with your -survey- package. I recommend installing the latest version of the survey package. You can type:

install.packages("survey")

If you’re using RStudio, try this in base R first and see if the problem is fixed.
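If updating the package does not help, one other thing that may be worth checking is whether any of the weight columns contain missing values. This is an assumption about a possible cause, not something confirmed in this thread: a "missing value where TRUE/FALSE needed" error can arise when a mean is computed over values that include NA. A minimal sketch of such a check, on a made-up data frame (data_check and its values are hypothetical):

```r
# Hypothetical extract: one replicate weight column has a missing value
data_check <- data.frame(
  ASECWT  = c(1200, 900, 1500),
  REPWTP1 = c(1100, 950, NA),
  REPWTP2 = c(1250, 880, 1490)
)

# Select the full-sample weight and the replicate weight columns
wt_cols <- grep("^(ASECWT|REPWTP[0-9]+)$", names(data_check), value = TRUE)

# Count missing values per weight column and show only the affected ones
na_counts <- colSums(is.na(data_check[wt_cols]))
na_counts[na_counts > 0]
```

On a real extract, replace data_check with your own data frame and adjust the weight variable names to match your supplement.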

Katie_Savin · May 18, 2023, 2:57pm

Thanks for this code and instruction, it is very helpful.
I am trying to use CPS data for descriptive statistics on SSI recipients in California and their rates of SNAP receipt in particular.
I am trying to use ASECWT in my code in RStudio. However, when I use the syntax you provided after installing the survey package, I receive this error message that I can’t clear:
Error in UseMethod("as_survey") :
no applicable method for 'as_survey' applied to an object of class "function"
Any advice?
Thanks in advance,
Katie

Ivan_Strahof · May 24, 2023, 3:02pm

The error indicates that as_survey received a function, rather than a data frame, as its first argument. This can happen when no object named data has been created in your session, so R resolves the name data to its built-in data() function. Would you be able to share where you are getting code to use replicate weights? IPUMS CPS provides this code on the replicate weights user guide. In RStudio, you first want to install and load the srvyr package:

install.packages("srvyr")
library("srvyr")

Then, run the as_survey function:

svy <- as_survey(data, weight = ASECWT, repweights = matches("REPWTP[0-9]+"), type = "JK1", scale = 4/160, rscales = rep(1, 160), mse = TRUE)
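One reading of the "applied to an object of class "function"" message is that the first argument was not a data frame. The sketch below shows a quick way to check what an object is before passing it to as_survey. The helper name describe_input is hypothetical, and the example uses utils::data only because that is what the bare name data refers to in a session where no data frame called data has been created.

```r
# Hypothetical helper: report whether an object is usable as survey input
describe_input <- function(x) {
  if (is.function(x)) {
    "function"
  } else if (is.data.frame(x)) {
    "data frame"
  } else {
    class(x)[1]
  }
}

describe_input(utils::data)         # "function": the built-in data() function
describe_input(data.frame(a = 1))   # "data frame": what as_survey expects
```

If the check reports "function" for your data object, rerun the step that reads the extract (for example the read_ipums_micro call earlier in this thread) before creating the survey design.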
