Weights for calculating employment transitions with longitudinal CPS BMS

Jacob_Greenspon · July 31, 2024, 8:25pm

Hello,

I’m analyzing the probability that individuals employed in a certain industry transition to employment in each other industry, by matching individuals’ CPS BMS responses across the months that they are in the survey (i.e. up to 8 ‘waves’, for mish=1,…,8). I have constructed the dataset so that I now have observations for each individual with two variables that record values of mish: origin_wave is the monthly survey wave in which they last reported employment in my industry of interest (=1, 2, …, 7), and destination_wave is the next monthly survey wave in which they reported employment in a different industry (=2, 3, …, 8). For example, my data look like:

        +---------------------------------------------------------------+
        |      cpsidp   origin_wave   destination_wave   destination_ind|
        |---------------------------------------------------------------|
    13. | 2.00001e+13          7               8                    250 |
    69. | 2.00001e+13          1               2                    241 |
    77. | 2.00001e+13          4               5                    522 |
    93. | 2.00001e+13          1               4                    321 |
   165. | 2.00001e+13          3               7                    180 |

I now want to combine all these transitions into a single calculation of the probability of transitioning to each industry. What are the appropriate weights to use for this? In particular, can I mix the different linking weights described in the documentation, assigning each individual’s transition a weight that depends on which monthly waves appear in origin_wave and destination_wave, as follows:

if adjacent months (at any point, e.g. mish=1 and 2; 5 and 6; or 6 and 7, etc) → use lnkfw1mwt
if not adjacent months but both are within the first 4 months of the panel rotation (mish 1-4) → use lnkfwmis14wt
if not adjacent but both are within the last 4 months of the panel rotation (mish 5-8) → use lnkfwmis58wt
if any two months across the break (i.e. mish1-4 and 5-8) → use lnkfwmis45wt

Is this approach correct, or is there something else that would be better?

Thank you in advance for your help!
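
For reference, the "single calculation" asked about above can be read as a weighted share of transitions by destination industry. The following is a minimal Python sketch under that assumption. The function name and the weights are hypothetical, and the industry codes only echo the listing above. It takes whichever linking weight is finally chosen for each transition as given.

```python
from collections import defaultdict


def transition_shares(transitions):
    """Return {destination_ind: weighted share of all transitions}.

    `transitions` is an iterable of (destination_ind, weight) pairs, where
    `weight` is the linking weight assigned to that person's transition.
    """
    totals = defaultdict(float)
    for destination_ind, weight in transitions:
        totals[destination_ind] += weight
    grand_total = sum(totals.values())
    if grand_total == 0:
        raise ValueError("all weights are zero; shares are undefined")
    return {ind: w / grand_total for ind, w in totals.items()}


# Hypothetical input: the weights are made up for illustration only.
example = [
    (250, 1500.0),
    (241, 2500.0),
    (522, 1000.0),
    (250, 500.0),
    (180, 2500.0),
]
shares = transition_shares(example)
for ind in sorted(shares):
    print(ind, round(shares[ind], 4))
```

These shares are conditional on a transition being observed. An unconditional transition probability would also need the people who stayed in the origin industry in the denominator.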

Ivan_Strahof · August 12, 2024, 9:32pm

I am not aware of any official guidance from the Census Bureau for such an analysis, but I can share some guidelines on using these longitudinal weights.

WTFINL is used to weight respondents in a single basic monthly survey, or in multiple pooled basic monthly surveys, so that the sample(s) are representative of the US noninstitutional population as a whole. The weight is based on the inverse probability of selection into the sample (and a few additional adjustments). When analyzing a linked sample, however, you are correct that the linking weights should be used in order to maintain representativeness. This is because the sample of respondents who are linked between two months (e.g., January and February 2024) is not selected with the same probability as the combination of respondents who are sampled in either of the two months. LNKFW1MWT is created by IPUMS for each sample of respondents who link across adjacent months by adjusting the WTFINL values using iterative proportional fitting (IPF), also called raking, of the new linked sample based on intersections of a set of demographic and geographic variables. Based on the description of your research question, your weighting strategy using LNKFW1MWT and LNKFW45WT seems correct to me.
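
To make the raking step concrete, here is a minimal Python sketch of iterative proportional fitting on a toy linked sample. It is not the IPUMS procedure. The two variables (sex, region), the base weights, and the target totals are all hypothetical, and the real weights are raked on intersections of a larger set of demographic and geographic variables.

```python
def rake(records, base_weights, margins, iterations=50, tol=1e-10):
    """Rake weights so weighted totals match the targets for each variable.

    records:      list of dicts, one per linked respondent
    base_weights: starting weights (WTFINL plays this role in the thread)
    margins:      {variable: {category: target weighted total}}
    Assumes every category has at least one respondent with positive weight.
    """
    weights = list(base_weights)
    for _ in range(iterations):
        max_change = 0.0
        for var, targets in margins.items():
            # Current weighted total in each category of this variable.
            totals = {cat: 0.0 for cat in targets}
            for rec, w in zip(records, weights):
                totals[rec[var]] += w
            # Scale each respondent's weight toward the category target.
            for i, rec in enumerate(records):
                factor = targets[rec[var]] / totals[rec[var]]
                weights[i] *= factor
                max_change = max(max_change, abs(factor - 1.0))
        if max_change < tol:
            break
    return weights


# Hypothetical toy data: four linked respondents, equal base weights.
records = [
    {"sex": "F", "region": "N"},
    {"sex": "F", "region": "S"},
    {"sex": "M", "region": "N"},
    {"sex": "M", "region": "S"},
]
base = [100.0, 100.0, 100.0, 100.0]
margins = {
    "sex": {"F": 260.0, "M": 240.0},
    "region": {"N": 300.0, "S": 200.0},
}
raked = rake(records, base, margins)
```

After raking, the weighted totals by sex and by region both match their targets, even though the linked sample is smaller than either monthly sample.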

However, your strategy using LNKFWMIS14WT and LNKFWMIS58WT needs to be modified. LNKFWMIS14WT is created by raking the sample of respondents who link across all four of their first months in sample (MISH). It represents the inverse probability of linking across all four of these months. As a result, it is incorrect to use this weight for a transition with an origin in MISH 1 and a destination in MISH 4. It will also equal 0 for respondents who are not linked across these months in the sample. Therefore, you will need to create your own linking weights for the links that are not covered by LNKFW1MWT and LNKFW45WT (e.g., MISH 1 to MISH 4). In the Weighting Linked Datasets section, we provide Stata replication files that you can use to create your own linking weights. I also strongly recommend reviewing what other researchers in the literature have done before creating your own weights to determine if this is the method that you want to pursue for your project.
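
The guidance in this reply can be summarized as a small lookup. This Python sketch is only one reading of the thread, not official guidance. The function name and the "CUSTOM" label are hypothetical. Treating MISH 4 to MISH 5 as a link across the break rather than as adjacent months is an assumption, made because those two interviews are not in adjacent calendar months.

```python
def weight_for_transition(origin_wave, destination_wave):
    """Return the name of the linking weight suggested in this thread.

    "CUSTOM" means no provided weight covers the link, so a weight would
    have to be built, e.g. with the replication files mentioned above.
    """
    if not (1 <= origin_wave < destination_wave <= 8):
        raise ValueError("expected 1 <= origin_wave < destination_wave <= 8")
    # Assumption: any link from MISH 1-4 to MISH 5-8 counts as crossing the
    # break, including 4 -> 5.
    if origin_wave <= 4 and destination_wave >= 5:
        # Spelled LNKFW45WT in the reply and lnkfwmis45wt in the question.
        return "LNKFW45WT"
    if destination_wave - origin_wave == 1:
        return "LNKFW1MWT"
    # Non-adjacent months within MISH 1-4 or within MISH 5-8.
    return "CUSTOM"


# The (origin_wave, destination_wave) pairs from the listing in the question.
for origin, destination in [(7, 8), (1, 2), (4, 5), (1, 4), (3, 7)]:
    print(origin, destination, weight_for_transition(origin, destination))
```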

Jacob_Greenspon · April 24, 2025, 10:35pm

Thank you, Ivan, for the very helpful and detailed answer. A note for anyone finding this thread later: the Stata replication files for creating your own linking weights use the ipfraking command, which seems to break in Stata 16 and later versions. To run the do-file on those versions, put the command "version 15" at the beginning of the do-file.
