Weighting for subpopulation analysis

Heeun_Kim · July 13, 2023, 3:09pm

Hi, I am working on a series of analyses using IPUMS ACS data for 2019-2021, restricted to women of reproductive age. I applied the restriction by selecting cases when I created the extract in IPUMS, before downloading it.

With this data set, which contains only women of reproductive age, I first used the replicate weights that IPUMS recommends. I am using Stata, so I used svyset and the svy prefix for the regressions. While working on that, several questions came to mind.

First, can I use the replicate weights when my extract contains only observations for women of reproductive age?
Second, using the replicate weights yields smaller standard errors than using the person weight (PERWT) with standard errors clustered at the state level. Which approach is correct?

Thank you!

Ivan_Strahof · July 14, 2023, 12:03am

Regarding analyses with subsamples of the data, IPUMS recommends using the subpop() option in Stata after svysetting the data. The user guide notes that users must first define the subpopulation with a dichotomous variable coded 0 for all cases that should be excluded from the analysis.

The decision of which method to use to generate standard errors is ultimately up to the researcher. The Census Bureau's design and methodology report, however, notes that users of the ACS PUMS files can compute the estimated variances of their statistics using one of two options: (1) the replication method, using the replicate weights released with the PUMS data, and (2) the design factor method. The report states that, in general, the replication method produces significantly more accurate variance estimates and is strongly recommended over the design factor method whenever practical.

This paper offers some additional insight into different methods of calculating standard errors with IPUMS data products and how the outcomes may vary.
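As a rough illustration of the subpop() approach, here is a minimal Stata sketch. It assumes an extract that includes all persons (not one already restricted with "Select Cases") and the IPUMS variables PERWT, REPWTP1-REPWTP80, SEX, and AGE. The variable names y, x, and reprowomen are hypothetical placeholders, and the 15-44 age range is only an assumption about how "reproductive age" is defined.

```stata
* Sketch only: y and x stand in for your own outcome and predictor.

* Declare the replicate-weight design (the svyset call discussed in this thread)
svyset [pweight=perwt], vce(brr) brrweight(repwtp1-repwtp80) fay(.5) mse

* Subpopulation indicator: 1 = include, 0 = exclude.
* SEX == 2 is female in IPUMS; the age bounds are an assumption.
generate byte reprowomen = (sex == 2 & inrange(age, 15, 44))

* Estimate within the subpopulation while keeping the full sample in memory
svy, subpop(reprowomen): regress y x
```

The point of the indicator is that cases outside the subpopulation stay in the data set and are excluded through subpop() rather than being dropped beforehand.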

Heeun_Kim · July 14, 2023, 2:39pm

Hi Ivan, thank you so much for your helpful comment!

Heeun_Kim · August 15, 2023, 12:33am

Hi, I just wanted to ask a couple of follow-up questions.
I would like to fit a mixed model that includes state random intercepts. However, replicate weights do not seem to work with multilevel models in either Stata or R.

In Stata, I ran svyset [pweight=perwt], vce(brr) brrweight(repwtp1-repwtp80) fay(.5) mse. I then tried several mixed models, but Stata returned an error message saying that mixed models are not supported by svy with vce(brr).

In R, I tried the WeMix package, which supports weighted mixed models, but it doesn’t seem to work with the replicate weights.

Is there any known method for performing multilevel modeling with the ACS replicate weights?

Ivan_Strahof · August 16, 2023, 5:18pm

Replicate weights and multilevel models are two different approaches to accounting for clustered sampling. I am not aware of any methods that allow for multilevel modeling with the ACS replicate weights. Although the Census Bureau’s “Estimating ASEC Variances with Replicate Weights” document is written for the CPS ASEC rather than the ACS, I recommend reviewing it to learn more about how replicate weights are constructed.
