Hello - I am generating 2-year ACS estimates by combining 1-year samples from 2021 and 2022. I would like to calculate standard errors. When I use 1-year or 5-year samples I use replicate weights - but because I’m creating my own pooled sample I’m not certain if this is possible. I have been dividing perwt and hhwt by 2 to create approximate weights in the pooled sample - can I do the same thing to the replicate weights provided?
Does anyone have any further guidance on creating your own multi-year estimates and calculating the standard errors?
My understanding is that you are wondering if you need to adjust replicate weights when pooling multiple 1-year ACS samples together in the same way that you need to adjust the weights. I am not aware of any explicit guidance for adjusting replicate weights when pooling multiple ACS samples and do not believe the adjustment that you suggest is necessary. I will outline why by describing the purpose of the weights versus the replicate weights, then share additional documentation that you might consider.
Weights (such as PERWT and HHWT) are designed to ensure that estimates based on your sample are representative of the population. They are additive; the weights should sum to the U.S. resident population in that year (and allow for estimates of demographic subgroups with sufficient representation in the sample). Accordingly, if you pool two adjacent 1-year ACS PUMS files, you are essentially double-counting the population in any total or count estimates. As you suggest, dividing the weights by two (or however many samples you are pooling) is appropriate to avoid this double-counting. This division is only necessary when estimating counts (e.g., the total number of persons meeting criteria X) and is not necessary for means, ratios, or proportions. The 5-year ACS PUMS files that the Census Bureau produces account for this pooling.
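To make the difference between counts and proportions concrete, here is a minimal sketch in Python. The records, weight values, and function names are made up for illustration and are not drawn from any ACS file; the sketch only shows that dividing the weights by the number of pooled samples changes a count but leaves a proportion unchanged.

```python
import math

# Toy pooled file of two 1-year samples. All values are made up for illustration.
# Each row: (sample year, PERWT, 1 if the person meets criterion X else 0)
records = [
    (2021, 100.0, 1), (2021, 150.0, 0), (2021, 50.0, 1),
    (2022, 120.0, 1), (2022, 130.0, 0), (2022, 60.0, 0),
]
N_SAMPLES = 2  # number of 1-year samples pooled


def weighted_count(rows, divisor=1):
    """Weighted number of persons meeting criterion X."""
    return sum(perwt / divisor * flag for _, perwt, flag in rows)


def weighted_share(rows, divisor=1):
    """Weighted proportion of persons meeting criterion X."""
    total = sum(perwt / divisor for _, perwt, _ in rows)
    return weighted_count(rows, divisor) / total


raw_count = weighted_count(records)                # adds both years together
pooled_count = weighted_count(records, N_SAMPLES)  # average annual count
raw_share = weighted_share(records)
pooled_share = weighted_share(records, N_SAMPLES)

print(raw_count, pooled_count)                # 270.0 135.0
print(math.isclose(raw_share, pooled_share))  # True
```

The divisor cancels out of the proportion because it appears in both the numerator and the denominator, which is why the adjustment only matters for counts.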
Each person or household record in the ACS has 80 replicate weights. These replicate weights allow you to simulate multiple samples using a single sample, thereby generating more informed standard errors. The estimate is recomputed once with each replicate weight, and the squared differences between these replicate estimates and the estimate generated using the full-sample weight are then summed to estimate standard errors. If you divide the replicate weights by 2, you will not be affecting the total counts; rather, you will be artificially cutting in half the variability in these alternate estimates that inform your standard errors.
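The replicate-weight calculation described above can be sketched in Python as follows. This assumes the successive difference replication formula as I understand it from the Census Bureau's PUMS documentation, SE = sqrt(4/80 × Σ(replicate estimate − full-sample estimate)²). The data are randomly generated stand-ins, the replicate weights are hypothetical perturbations of the full-sample weight rather than real ACS replicate weights, and the function names are my own. It estimates the standard error of a weighted proportion only.

```python
import math
import random

N_REPLICATES = 80  # number of replicate weights on each ACS record


def weighted_share(weights, flags):
    """Weighted proportion of records with flag == 1."""
    return sum(w * f for w, f in zip(weights, flags)) / sum(weights)


def sdr_standard_error(full_estimate, replicate_estimates):
    """Standard error from 80 replicate estimates: sqrt(4/80 * sum of squared diffs)."""
    if len(replicate_estimates) != N_REPLICATES:
        raise ValueError("expected 80 replicate estimates")
    squared_diffs = sum((rep - full_estimate) ** 2 for rep in replicate_estimates)
    return math.sqrt(4.0 / N_REPLICATES * squared_diffs)


# Made-up stand-in data; real work would read the weights from the PUMS extract.
random.seed(0)
n_records = 200
flags = [random.randint(0, 1) for _ in range(n_records)]
perwt = [random.uniform(50, 150) for _ in range(n_records)]
# Hypothetical replicate weights: the full-sample weight times a random factor.
repwt = [
    [w * random.choice((0.5, 1.0, 1.5)) for w in perwt]
    for _ in range(N_REPLICATES)
]

full_share = weighted_share(perwt, flags)                # full-sample estimate
rep_shares = [weighted_share(rw, flags) for rw in repwt]  # one estimate per replicate
se = sdr_standard_error(full_share, rep_shares)

print(round(full_share, 4), round(se, 4))
```

In practice, survey packages that accept replicate weights perform this calculation for you; the sketch is only meant to show where the full-sample estimate and the 80 replicate estimates enter the formula.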
Another way to frame your question is to think about replicate weights as a substitute for sample design variables. If you were instead using sample design variables, you wouldn’t adjust those to account for pooling (except in cases where strata or PSU values are reused across years but do not refer to the same unit); you want to take advantage of the full information available when generating your standard errors. Similarly, you wouldn’t want to adjust your replicate weights.
I am linking to the ACS methodology report (version 3.0) as this is certainly a helpful reference for questions about weighting and variance estimation in the ACS PUMS data.