Hello, I am interested in estimating the number of people who spend any time providing care to a household member by year for the years 2015-2021. Based on the documentation, my understanding is I should be using the BWT variable, but it is not available for all the years I am interested in examining.
As you noted, BWT would be the proper weight to use for person-level analyses that do not involve time use variables, but it is not available for all years. For most person-level analyses over time (both with and without time use variables), IPUMS recommends using WT06, which is available for 2003-2019 and 2021. Within a year, WT06 sums to the number of person-days in the country in a quarter. To get estimates of person counts using these weights, you’ll need to divide them by the number of days in your time period of interest. If you are making annual estimates, you should divide the weights by 365 (or 366 in a leap year). If you are doing a quarterly analysis, here are the numbers to divide by:
First quarter, Q1: 1 January – 31 March (90 days or 91 days in leap years)
Second quarter, Q2: 1 April – 30 June (91 days)
Third quarter, Q3: 1 July – 30 September (92 days)
Fourth quarter, Q4: 1 October – 31 December (92 days)
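As a minimal sketch of the annual case, assuming an extract that contains year and wt06, and a hypothetical 0/1 indicator named caregiver (not an IPUMS variable) marking respondents who spent any time caring for a household member, the conversion from person-day weights to person weights might look like this in Stata:

```stata
* Sketch: turn person-day weights into person weights for annual estimates.
* Assumptions: year and wt06 come from the extract; caregiver is a
* hypothetical 0/1 indicator built by the analyst.
* The mod(year, 4) leap-year test is only valid for years 1901-2099.
generate int days_in_year = cond(mod(year, 4) == 0, 366, 365)
generate double wt06_person = wt06 / days_in_year

* Weighted count of caregivers by year. The standard errors reported here
* do not use the replicate weights; only the point estimates are of interest.
total caregiver [pweight = wt06_person], over(year)
```

For a quarterly analysis, the same division would apply with the quarter lengths listed above in place of days_in_year.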
For 2020, there are different weights (WT20), because the survey was paused during the beginning of the COVID pandemic. For that year, you’ll need to subtract the excluded days (Mar 18 - May 9) from the total days in the year or quarter, before dividing the weight.
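The 2020 divisor can be worked out with Stata's date arithmetic. This is only a sketch: it assumes the excluded window runs from 18 March through 9 May inclusive, and that wt20 and year are in the extract; wt20_person is a name chosen for illustration. Under that assumption the window covers 53 days, leaving 313 days as the annual divisor.

```stata
* Sketch: day-count divisor for 2020 when using wt20.
* Assumption: the excluded window is 18 March - 9 May 2020, inclusive.
scalar excluded_2020 = mdy(5, 9, 2020) - mdy(3, 18, 2020) + 1
scalar days_2020 = (mdy(12, 31, 2020) - mdy(1, 1, 2020) + 1) - scalar(excluded_2020)
display "Excluded days: " scalar(excluded_2020) "   Divisor for 2020: " scalar(days_2020)

* Person weight for the 2020 sample only
generate double wt20_person = wt20 / scalar(days_2020) if year == 2020
```

Under the same inclusive assumption, 14 of the excluded days fall in the first quarter (18-31 March) and 39 in the second (1 April - 9 May), so the 2020 quarterly divisors would be 77 and 52 rather than 91 and 91.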
For more information on weights, please consult the ATUS User’s Manual, chapter 7.
Does this accounting go for CLUSTER and STRATA as well? Do you need to renumber strata and cluster by year if you made a multiyear extract (e.g., egen new_cluster = group(cluster year))?
The ATUS user guide recommends calculating standard errors with replicate weights (RWT06). The Census Bureau produced these replicate weights by using what is known as the Successive Difference Replication (SDR) method, which involves repeated implementations of the initial weighting algorithm. RWT06 is available across all samples aside from 2020, when RWT20 is available instead. This page on CPS replicate weights provides information on how to use these weights in Stata.
If you were to go the route of calculating standard errors using STRATA and CLUSTER, you should note that these are not available across all sample years. Also, while I believe the clusters and strata should be renumbered to be year-specific due to annual resampling, I am not aware of any formal guidance from BLS on how to handle these in pooled extracts.
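For illustration only, a year-specific renumbering could be sketched as below. This is not BLS or IPUMS guidance; it assumes year, strata, cluster, and wt06 are in the extract for years where STRATA and CLUSTER exist, and caregiver is a hypothetical 0/1 outcome. Including strata in the cluster grouping is a defensive choice in case cluster codes repeat across strata.

```stata
* Sketch only: no formal guidance exists for pooled extracts.
* Make both strata and clusters unique within each survey year.
egen long strata_yr  = group(year strata)
egen long cluster_yr = group(year strata cluster)

* Declare the design with the year-specific identifiers
svyset cluster_yr [pweight = wt06], strata(strata_yr) singleunit(centered)

* Example estimate: share of respondents providing care, by year
svy: mean caregiver, over(year)
```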
Thank you for the detailed reply. Fortunately, I am only working with 2003 - 2013, so strata and cluster are available for them all, but I take your meaning.
To elaborate, my concern was that the CPS replicate weights are frequency weights, while the ATUS weights are probability weights, so I wanted to make sure I was correctly accounting for that difference. Since (I think) STRATA and CLUSTER are MPC products, I wanted to know if they were capable of pooling.
It sounds like it might work for back-of-the-envelope estimates, but that the Census-recommended approach is to use replicate weights or stratify my analysis by year.
I wanted to follow up on a few more points to help better clarify the issue.
The first is that ATUS replicate weights are derived from CPS replicate weights. The process of using replicate weights is described in the BLS ATUS user guide and is identical to the one on the linked CPS page. The second is that STRATA and CLUSTER are not MPC or IPUMS products, but are original variables released by the BLS in the ATUS as GEPSEUST and GEPSEUCL. The BLS stopped providing these variables in 2014. The last point is that while you can stratify your analysis by year, the Census-recommended approach is specifically to use replicate weights for variance estimation.
Greetings! As a follow up, I want to make sure I am specifying my sampling protocol correctly.
I get that the new protocol is to use the series of replicate weights; however, for the specific project I am working on, I need to use WT06, so I want to use CLUSTER and STRATA as well.
Previously, I had been using the following Stata setup, but I got, oddly, smaller standard errors: egen cluster2 = group(year cluster); svyset cluster2 [pw = wt06] , strata(strata) singleunit(centered) ;
But if I am reading this right, if I am combining many survey years to run a repeated cross-sectional analysis, I would set up the sampling thusly in Stata: svyset year || cluster [pw = wt06] , strata(strata) singleunit(centered) || cpsidp ;
I think that you might be a bit confused about the difference between these weights. Usage of replicate weights (RWT06) does not preclude the usage of probability weights (WT06). In fact, it’s typically recommended to use both of these within a single analysis. Replicate weights are used for variance estimation and are produced through the Successive Difference Replication (SDR) method, which involves repeated implementations of the initial weighting algorithm. Probability weights, on the other hand, reflect the inverse of the probability of inclusion in the sample. Using replicate weights will affect your standard errors, but (unlike probability weights) not your point estimates. The way to specify this analysis in Stata is:
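A minimal sketch of such a specification, not necessarily the one originally posted, follows the SDR pattern used for CPS replicate weights. It assumes the extract includes 160 replicate weights named rwt06_1 through rwt06_160 (check the variable names in your own extract) and uses a hypothetical 0/1 outcome named caregiver:

```stata
* Sketch: point estimates from wt06, standard errors from the replicate weights.
* Assumptions: replicate weights are named rwt06_1 ... rwt06_160;
* caregiver is a hypothetical analyst-built indicator.
svyset [pweight = wt06], vce(sdr) sdrweight(rwt06_1-rwt06_160) mse

* Example estimate: share of respondents providing care, by year
svy: mean caregiver, over(year)
```

Note that strata() and a PSU variable do not appear in this declaration; the replicate weights carry the design information used for variance estimation.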
Regarding your proposed method, while I believe the clusters and strata should be renumbered to be year-specific due to annual resampling, I am not aware of any formal guidance from BLS on how to handle these in pooled extracts. The Census-recommended approach is specifically to use replicate weights for variance estimation.
Thank you for the reply. To this end, I notice that the suggested weighting scheme you posted based on Census protocol does not include strata and cluster in the sampling units. Does this mean that a researcher who wants to make external validity claims using this data should no longer use strata and cluster? Is the use case for those terms simply for historical archiving, or am I misinterpreting the intent of that last post?
STRATA and CLUSTER are being retained since this data is already available and because some researchers may want to replicate their own findings or the findings of another paper that used these variables in the past. The Census Bureau states that replicate weights are “necessary for generating standard errors for ATUS estimates.” Since replicate weights cannot be combined with STRATA and CLUSTER, using these parameters for variance calculation would be ignoring this requirement. Whether this allows researchers to make external validity claims is up to the researcher to determine based on the literature and their understanding of the econometric theory as presented in the ATUS User Guide (section 7.5).