@JeffBloem Thanks for the reply. The 3-year ACS files were discontinued after the 2011-2013 release, so to balance currency of data with sample size, I am trying to combine three 1-year estimates.
I was concerned about using the replicate weights, but I think I just solved the issue by using the STRATA and CLUSTER variables instead. Can you answer these questions for me?
- Does IPUMS calculate in-house replicate weights instead of reporting Census PUMS replicate weights?
- Does IPUMS offer CLUSTER and STRATA as an alternative method for calculating standard errors? If so, are REPWT and REPWTP simply separate variables that an analyst could use instead to estimate the standard errors?
- Why would Census PUMS be better than IPUMS for combining three 1-year files?
Last, can you tell me if this approach seems correct? The results do look accurate: not only the estimates, but also the margins of error, which seem appropriate when the sample size is reduced.
- Take 2016, 2017 and 2018 ACS 1-year estimates (IPUMS).
- For household-level estimates, divide HHWT by 3† and filter for PERNUM == 1.
- Specify the survey design using the CLUSTER and STRATA fields as well as the revised HHWT field.
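As a sanity check on the divide-by-3 step, here is a toy sketch in base R. The numbers are made up (they are not ACS records), and it only illustrates that summing HHWT / 3 over the pooled householder records gives the average of the three annual household totals. It says nothing about the standard errors.

```r
# Toy person-level file: made-up weights, three survey years.
# Rows with PERNUM == 2 are second persons in a household and repeat its HHWT.
toy <- data.frame(
  YEAR   = c(2016, 2016, 2016, 2017, 2017, 2018, 2018, 2018),
  PERNUM = c(1, 2, 1, 1, 1, 1, 2, 1),
  HHWT   = c(100, 100, 200, 150, 150, 120, 120, 210)
)

# Keep one record per household (the householder).
hh_toy <- toy[toy$PERNUM == 1, ]

# Each year contributes one third of its weight to the pooled file.
hh_toy$HHWT_3 <- hh_toy$HHWT / 3

# Weighted household count from the pooled file.
pooled_total <- sum(hh_toy$HHWT_3)

# The same quantity computed as the mean of the three annual totals.
annual_totals <- tapply(hh_toy$HHWT, hh_toy$YEAR, sum)
mean_annual   <- mean(annual_totals)

print(pooled_total)
print(mean_annual)
```

Both values agree here (up to floating-point rounding), which is what I would expect whenever every year gets the same 1/3 factor.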
My code in R:
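library(dplyr)
library(srvyr)

# Householder records only (PERNUM == 1), so each household is counted once.
# HHWT_3 is the revised weight, HHWT / 3, already present in pums_cleaned.
h <- pums_cleaned %>%
  filter(PERNUM == 1) %>%
  srvyr::as_survey_design(ids = CLUSTER,
                          strata = STRATA,
                          weights = HHWT_3)

result <- h %>%
  filter(YEAR %in% c(2016, 2017, 2018)) %>%  # the three pooled 1-year samples
  summarize(hh = survey_total(na.rm = TRUE),
            count = unweighted(n())) %>%
  mutate(hh_moe = hh_se * 1.645,  # 90% margin of error
         hh_cv = hh_se / hh,      # coefficient of variation
         hh_reliability = case_when(hh_cv > 0.4 ~ "3. Unreliable",
                                    hh_cv <= 0.4 & hh_cv > 0.2 ~ "2. Use with caution",
                                    hh_cv <= 0.2 ~ "1. Use"))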
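For reference, the margin-of-error and CV arithmetic from the mutate() step, worked through in base R on a made-up estimate and standard error. The helper name classify_cv is hypothetical, and treating a CV of exactly 0.2 as "1. Use" is my assumption about the intended cut point.

```r
# Hypothetical estimate and standard error (made-up numbers).
hh    <- 12000
hh_se <- 3000

# 90% margin of error and coefficient of variation.
hh_moe <- hh_se * 1.645
hh_cv  <- hh_se / hh

# Reliability label using the 0.4 and 0.2 cut points.
# Assumption: a CV of exactly 0.2 falls in "1. Use".
classify_cv <- function(cv) {
  ifelse(cv > 0.4, "3. Unreliable",
         ifelse(cv > 0.2, "2. Use with caution", "1. Use"))
}

hh_reliability <- classify_cv(hh_cv)

print(hh_moe)
print(hh_cv)
print(hh_reliability)
```

With these inputs the CV is 0.25, so the estimate lands in the "2. Use with caution" band.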
†: Your advice about dividing not by 3 but by the weighted sample size is noted—great point!