# How can I use replicate weights to create standard errors in R?

#1

I am using R to analyze CPS data on household income and would like to use the replicate weights to create standard errors.

I am aware that code for this exists for Stata and other statistical software, but I am having trouble translating it to R.

#2

Note that the CPS weighting system changed a little this year, and not all of our documentation has been updated. There used to be a single weight variable name across all supplements (WTSUPP), but the variable name now depends on which supplement you are using. The examples below use ASEC data, and therefore ASECWT; see the chart at this link for the variable that applies to your supplement:

https://cps.ipums.org/cps/weights_ren…

Here are some examples, first using the survey package and then using srvyr (which is built on survey but uses dplyr syntax).

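```r
library(ipumsr)
library(dplyr)

# Read data and some light data formatting
# (placeholder file name: point this at the DDI .xml file of your own extract)
data <- read_ipums_micro("cps_00001.xml")
#> Use of data from IPUMS-CPS is subject to conditions including that users should
#> cite the data appropriately. Use command `ipums_conditions()` for more details.

data <- data %>%
  mutate(
    AGE = as.numeric(AGE),
    SEX = as_factor(SEX),
    # Set the INCTOT missing-value codes (>= 99999990) to NA before converting
    INCTOT = as.numeric(lbl_na_if(INCTOT, ~ .val >= 99999990))
  )
```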
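Before building the survey design, it can help to confirm that the replicate-weight pattern picks up exactly the 160 columns that `rscales = rep(1, 160)` expects. The following is a minimal sketch on a made-up data frame, not on a real extract; the `toy` object and its values are purely illustrative, and the extra non-numbered `REPWTP` column is an assumption about what an extract may contain.

```r
# Toy stand-in for an IPUMS CPS extract (values are made up).
# Assumption: besides REPWTP1..REPWTP160, the extract may carry a
# non-numbered REPWTP column that must not be treated as a weight.
toy <- data.frame(ASECWT = c(1500, 2200), REPWTP = c(1, 1))
rep_block <- as.data.frame(
  matrix(1, nrow = 2, ncol = 160,
         dimnames = list(NULL, paste0("REPWTP", 1:160)))
)
toy <- cbind(toy, rep_block)

# The pattern requires at least one digit after "REPWTP",
# so the non-numbered column is not matched.
rep_cols <- grep("^REPWTP[0-9]+$", names(toy), value = TRUE)

# Stop early if the count is not what the design call assumes
stopifnot(length(rep_cols) == 160)
```

Running the same `grep()` on `names(data)` from your own extract gives a quick check that no replicate weight was left out when the extract was created.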

```r
# If not installed already: install.packages("survey")
library(survey)

# Replicate-weight design: ASECWT is the full-sample weight and
# REPWTP1-REPWTP160 are the replicate weights (matched by regular expression)
svy <- svrepdesign(
  data = data,
  weights = ~ASECWT,
  repweights = "REPWTP[0-9]+",
  type = "JK1",
  scale = 4/60,
  rscales = rep(1, 160),
  mse = TRUE
)

# Calculate mean of INCTOT
svymean(~INCTOT, svy, na.rm = TRUE)
#>         mean     SE
#> INCTOT 42526 383.64

# Calculate a mean of INCTOT, on the subset of people aged 25-64
svy_subset <- subset(svy, AGE >= 25 & AGE < 65)
svymean(~INCTOT, svy_subset, na.rm = TRUE)
#>         mean     SE
#> INCTOT 51407 496.95

# Calculate the mean of INCTOT by SEX
svyby(~INCTOT, ~SEX, svy, svymean, na.rm = TRUE)
#>           SEX   INCTOT       se
#> Male     Male 53196.41 637.2199
#> Female Female 32456.95 325.3275
```
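To make clear what the replicate-weight design is doing, here is a minimal base-R sketch of the calculation behind the standard errors. With `mse = TRUE`, the variance is `scale * sum(rscales * (theta_r - theta)^2)`, where `theta` is the estimate under the full-sample weight and `theta_r` is the estimate under replicate weight `r`. Everything below is simulated: the incomes, the weights, and the replicate weights are random numbers, not CPS data, and `rep_se()` is a hypothetical helper written for this illustration, not a function from the survey package. The value of `scale` is left as an argument; as far as I know the Census Bureau's published CPS formula uses 4 divided by the number of replicates (4/160), so it is worth checking the multiplier you use against the current replicate-weight documentation.

```r
# Toy illustration of the replicate-weight variance formula (base R only).
# All values are simulated; the replicate weights are random perturbations
# of the full-sample weight, not real CPS replicate weights.
set.seed(1)
n <- 200   # number of respondents
R <- 160   # number of replicate weights, as in the ASEC files

y <- rlnorm(n, meanlog = 10, sdlog = 1)   # fake incomes
w <- runif(n, 500, 3000)                  # fake full-sample weights
# n x R matrix: one column per replicate weight
rep_w <- sapply(seq_len(R), function(r) w * runif(n, 0.3, 1.7))

# Estimate once with the full-sample weight, then once per replicate weight
theta_full <- weighted.mean(y, w)
theta_reps <- apply(rep_w, 2, function(wr) weighted.mean(y, wr))

# Hypothetical helper: SE = sqrt(scale * sum(rscales * (theta_r - theta)^2))
rep_se <- function(theta_full, theta_reps, scale,
                   rscales = rep(1, length(theta_reps))) {
  sqrt(scale * sum(rscales * (theta_reps - theta_full)^2))
}

# `scale` is an input: take it from the replicate-weight documentation
se <- rep_se(theta_full, theta_reps, scale = 4 / R)
se
```

The same pattern applies to any statistic: compute it 161 times (once with ASECWT, once with each REPWTP column) and combine the squared differences.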

```r
# If not installed already: install.packages("srvyr")
library(srvyr)

# Same replicate-weight design, built with srvyr's dplyr-style interface
svy <- as_survey(
  data,
  weights = ASECWT,
  repweights = matches("REPWTP[0-9]+"),
  type = "JK1",
  scale = 4/60,
  rscales = rep(1, 160),
  mse = TRUE
)

# Calculate mean of INCTOT
svy %>%
  summarize(mn = survey_mean(INCTOT, na.rm = TRUE))
#> # A tibble: 1 x 2
#>       mn mn_se
#>    <dbl> <dbl>
#> 1 42526.  384.

# Calculate a mean of INCTOT, on the subset of people aged 25-64
svy %>%
  filter(AGE >= 25 & AGE < 65) %>%
  summarize(mn = survey_mean(INCTOT, na.rm = TRUE))
#> # A tibble: 1 x 2
#>       mn mn_se
#>    <dbl> <dbl>
#> 1 51407.  497.

# Calculate the mean of INCTOT by SEX
svy %>%
  group_by(SEX) %>%
  summarize(mn = survey_mean(INCTOT, na.rm = TRUE))
#> # A tibble: 2 x 3
#>   SEX        mn mn_se
#>   <fct>   <dbl> <dbl>
#> 1 Male   53196.  637.
#> 2 Female 32457.  325.
```