Hey folks,
I’m trying to estimate union membership change between 2019 and 2022. I’ve not used CPS microdata before and I’m not sure what weight I should be using along with the UNION variable. It looks like it might be EARNWT?
Right now I’m filtering it to the counties and years I need, excluding all (union == 0), and looking at the proportion of EARNWT/total EARNWT by year and county. I feel like I’m missing something critical here.
It sounds like something in your results is unexpected. If you can provide a bit more information on what you are seeing, I can provide more targeted feedback. In the meantime, I will share a few general comments.
- EARNWT is appropriate to use for analyses of UNION in the Outgoing Rotation Group (ORG)/Earner Study (the ORG is a subset of the basic monthly sample). Note that UNION is also included in the ASEC; person-level analyses of ASEC data should be weighted with ASECWT. In general, I would advise against using the basic monthly and the ASEC data in the same analysis, because you will double-count persons who are included in the ORG.
- If you are using the ORG samples and including all months in 2019 and 2022, you would want to account for the pooling if looking at counts (though it sounds like you are looking at proportions right now).
- CPS is designed to be representative at the state level, though there are sub-state geographies identified and people use these units in their research. I would expect sub-state estimates of union status, such as county-level estimates from a single year of data, to vary considerably more than state-level estimates.
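As a rough illustration of the weighting described above, here is a minimal Python sketch that computes EARNWT-weighted shares of each UNION category by year and county. It assumes the extract has columns named YEAR, COUNTYFIP, UNION, and EARNWT, and that UNION uses the IPUMS CPS coding (0 = not in universe, 1 = no coverage, 2 = member, 3 = covered but not a member); check the codes against your own extract. The rows in `sample` are made up for illustration.

```python
from collections import defaultdict

# Assumed IPUMS CPS UNION codes: 0 = not in universe, 1 = no union coverage,
# 2 = union member, 3 = covered by a union but not a member.
UNION_LABELS = {1: "no coverage", 2: "member", 3: "covered, not member"}


def weighted_union_shares(records):
    """Return {(year, county): {label: share}} using EARNWT as the weight.

    `records` is an iterable of dicts with keys YEAR, COUNTYFIP, UNION, EARNWT.
    Rows whose UNION code is not in UNION_LABELS (e.g. 0, not in universe)
    are dropped before shares are computed.
    """
    totals = defaultdict(float)
    by_status = defaultdict(lambda: defaultdict(float))
    for row in records:
        if row["UNION"] not in UNION_LABELS:
            continue
        key = (row["YEAR"], row["COUNTYFIP"])
        totals[key] += row["EARNWT"]
        by_status[key][UNION_LABELS[row["UNION"]]] += row["EARNWT"]
    return {
        key: {label: w / totals[key] for label, w in statuses.items()}
        for key, statuses in by_status.items()
    }


# Made-up rows for illustration only; these are not real CPS values.
sample = [
    {"YEAR": 2019, "COUNTYFIP": 36103, "UNION": 2, "EARNWT": 3000.0},
    {"YEAR": 2019, "COUNTYFIP": 36103, "UNION": 1, "EARNWT": 9000.0},
    {"YEAR": 2019, "COUNTYFIP": 36103, "UNION": 0, "EARNWT": 5000.0},
]
shares = weighted_union_shares(sample)
print(shares[(2019, 36103)])
```

The denominator here is the sum of EARNWT over in-universe rows only, which is the same "EARNWT / total EARNWT" calculation described in the question.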
Hey Kari, thanks for getting back so quickly.
I’m seeing a sudden jump in union membership for a county from about 21% in 2019 to 34% in 2020. There’s also a sudden increase of non-union but covered individuals in 2022 at 8.55%, from 0% in 2020.
I don’t believe I’m using ASEC data; under sample selection I’ve chosen “BASIC MONTHLY” > Jan 2019, Jan 2020, Jan 2021 and Jan 2022.
I’m looking specifically at COUNTYFIP 36059 and 36103, excluding all UNION == 0, and looking at the proportions of EARNWT by UNION membership.
Let me know if I’m not detailed enough, I really appreciate the help with navigating this data.
The sample sizes you are dealing with are simply too small to support reliable population inferences (e.g., I get 15 union members as the unweighted frequency in Suffolk County in 2019; this is the highest unweighted count I see for union coverage in these counties and samples). In general, there is no bright-line rule regarding “too small to study,” though I can say that more is always better, and I would treat any single-digit cell as too small. In practice, the sampling error around your estimates will be large enough to limit what you can meaningfully conclude from the data.
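To get a feel for how large that sampling error can be, here is a back-of-the-envelope Python sketch. It uses the simple-random-sampling formula for the standard error of a proportion, which is an assumption on my part: it ignores the CPS's complex sample design and will generally understate the true error. The inputs (a share of 0.21 from 70 unweighted respondents) are hypothetical, not figures from the data.

```python
import math


def srs_standard_error(p, n):
    """Standard error of a proportion under simple random sampling.

    This is only a rough lower bound for CPS estimates, because it ignores
    clustering and weighting in the actual sample design.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    return math.sqrt(p * (1.0 - p) / n)


# Hypothetical inputs: an estimated share of 0.21 from 70 unweighted respondents.
p_hat, n_unweighted = 0.21, 70
se = srs_standard_error(p_hat, n_unweighted)

# Rough 95% interval using the normal approximation.
lower, upper = p_hat - 1.96 * se, p_hat + 1.96 * se
print(f"share={p_hat:.2f}, SE~{se:.3f}, rough 95% interval=({lower:.2f}, {upper:.2f})")
```

Even under these optimistic assumptions the interval spans many percentage points, so a year-to-year jump of similar size could easily be noise.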
One way to increase the sample size of your estimates is to pool multiple samples together (e.g., across various months if you are using the basic monthly samples in IPUMS CPS). This will increase the number of observations in your data and the statistical precision but will limit the temporal precision of your analysis. Note that if you do pool together multiple samples you will need to adjust the sampling weights so that they properly account for the combined samples. An approximate way to do this is to divide the sampling weight by the number of samples you are pooling together.
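The approximate adjustment described above could be sketched in Python as follows. This assumes each row carries YEAR and MONTH columns identifying its sample, and it writes the adjusted weight to a new column, `POOLED_WT`, which is a name I made up for illustration. The rows in `rows` are also made up.

```python
from collections import defaultdict


def add_pooled_weight(records, weight="EARNWT"):
    """Divide each row's weight by the number of monthly samples pooled in its year.

    `records` is a list of dicts with keys YEAR, MONTH, and the weight column.
    Returns new dicts with an extra POOLED_WT key (a hypothetical column name).
    """
    months_per_year = defaultdict(set)
    for row in records:
        months_per_year[row["YEAR"]].add(row["MONTH"])
    return [
        dict(row, POOLED_WT=row[weight] / len(months_per_year[row["YEAR"]]))
        for row in records
    ]


# Made-up rows: two monthly samples pooled for 2019, one for 2020.
rows = [
    {"YEAR": 2019, "MONTH": 1, "EARNWT": 1000.0},
    {"YEAR": 2019, "MONTH": 2, "EARNWT": 2000.0},
    {"YEAR": 2020, "MONTH": 1, "EARNWT": 500.0},
]
pooled = add_pooled_weight(rows)
for row in pooled:
    print(row["YEAR"], row["MONTH"], row["POOLED_WT"])
```

Counting the distinct months per year, rather than hard-coding a divisor, keeps the adjustment correct if some years contribute fewer samples than others.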
If I’m pooling the samples together (for example, a year’s worth), would it be simpler or better for me to use ASEC and ASECWT for the analysis instead?
If UNION is your focal variable, merging together multiple outgoing rotation groups (ORGs) is the best way to augment your sample size. If pooling multiple ORGs, you will need to adjust the values of EARNWT to generate accurate point estimates. Instructions for how to do this are at the bottom of these notes about the ORG from IPUMS CPS:
Researchers should use the earnings weight EARNWT for the outgoing rotation group/earner study variables. According to the NBER CPS Merged Outgoing Rotation Groups documentation, the sum of EARNWT for each month is the sum of the total population, so weights will need to be divided by 12 when pooling over one year.
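As a worked version of that adjustment, here is how I would write it out. The notation is my own: $w_{im}$ is EARNWT for person $i$ in month $m$ of a full year of ORG samples, and $u_{im}$ is 1 if that person is a union member and 0 otherwise.

```latex
\hat{N}_{\text{members}} = \frac{1}{12} \sum_{m=1}^{12} \sum_{i} w_{im}\, u_{im},
\qquad
\hat{p}_{\text{members}} =
  \frac{\tfrac{1}{12} \sum_{m} \sum_{i} w_{im}\, u_{im}}
       {\tfrac{1}{12} \sum_{m} \sum_{i} w_{im}}
  = \frac{\sum_{m} \sum_{i} w_{im}\, u_{im}}{\sum_{m} \sum_{i} w_{im}}
```

The factor of 1/12 matters for counts but cancels out of proportions, as long as every year in the comparison pools the same number of months.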