Longitudinal link weights

Sheela_Kennedy · May 14, 2020, 4:26pm

What is the difference between a 0 and a missing code for the longitudinal weights? Does a missing code indicate that a link is not possible (an unlinkable sample), and a 0 indicate that a link is theoretically possible but no link was made?

When someone in MISH 8 receives a code of 0 for lnkfw1mwt, does this indicate that links between that MONTH and YEAR and the MONTH/YEAR of the subsequent MISH (+1 month or +8 months depending on MIS) are possible, even though this particular respondent isn’t eligible to link?

Thanks!

JeffBloem · May 15, 2020, 2:17pm

The difference between a longitudinal weight value of zero and a missing code is largely conceptual. In practice, observations associated with either code will effectively be omitted from weighted analysis. As discussed in the WTFINL variable description, the raw sampling weights in the basic monthly samples are “based on the inverse probability of selection into the sample and adjustments for the following factors: failure to obtain an interview; sampling within large sample units; adjustments to the known distribution of the entire population according to state, age, sex, race, and Hispanic ethnicity; and allotting a weight of zero to populations not sampled in other monthly surveys (i.e., persons in the Hispanic oversample and members of the armed forces in ASEC samples).” Observations with missing codes, on the other hand, cannot be linked for a variety of reasons, including: (i) the individual should have responded to the questionnaire in the subsequent month or (ii) the individual is in an incoming or outgoing rotation group (e.g., MISH==1, 4, 5, or 8).

Sheela_Kennedy · May 15, 2020, 4:37pm

Hi Jeff,

Thanks for your reply. I’m talking specifically about the longitudinally linking weights, e.g. LNKFW1MWT.

The documentation on IPUMS defines the universe for this variable as “Those individuals who are eligible to link to the next month.” The vast majority (97% from 1976-present) of people who are in MIS 8 (and therefore not eligible to link to the next month due to survey design) receive a code of zero (0), rather than missing (.). If the universe statement were correct, all of these individuals would have received a missing code (.). Further, the documentation doesn’t define values of zero (0).

The remaining 3% receive a missing code. The missing codes, from looking at the data, appear to be concentrated in “unlinkable samples”, which is consistent with the statement “Users should note that availability of these variables in a given sample is contingent on the availability of the samples required to make the given type of link” from the FAQs on weighting linked samples.

Another area where the documentation could be clarified is what is meant by adjacent months (specifically calendar MONTH vs MISH). My first assumption for interpreting LNKFW1MWT was that for someone in MISH 4, this would be the weight for linking to MISH 5. It clearly isn’t: everyone in MISH 4 receives a value of 0 or ., just as for MISH = 8. Instead, it looks like the relevant weight for that comparison is LNKFWMIS45WT, and adding a note in the documentation might be helpful.

I think this could be further clarified by saying “two adjacent calendar months” rather than two adjacent months and being clear in the documentation about how to interpret the weight for those in MISH 4 and 8 (who are not eligible to link to the next calendar month, yet are included in the universe for this variable).

Likewise clarifying the universe statement for LNKFW1MWT to be specific to calendar month might also help: “Individuals interviewed in a calendar month that is linkable to the next calendar month.”

Also, it would be nice to have clarity on the meaning of zero.

The reason this matters to me is that I was hoping to use these weights to identify linkable samples for analysis, rather than simply to use the weights. So, having missing codes (.) identify unlinkable samples is really helpful to me and I hope you would keep this feature.

I’m a former IPUMS staffer, so I’m happy to follow up directly with Jose or others on CPS.

And thanks to the IPUMS-CPS staff … I was delighted to learn that these weight variables (hopefully) identify unlinkable samples.

Sheela

JeffBloem · May 18, 2020, 6:22pm

Thanks for these detailed notes. I will send this to the IPUMS CPS Team to use as a reference when they update the documentation for the longitudinal sampling weights. These longitudinal sampling weights were released relatively recently, so this feedback is very valuable.

Regarding identifying linkable samples: The IPUMS CPS Team recently held a Summer Workshop that focused on using the linking capabilities of the CPS data. It seems to me that some of the resources shared for this workshop (all of which are available here) will be helpful to you.

Sheela_Kennedy · May 18, 2020, 11:21pm

Thanks for sharing the additional materials!

Clemens_Oberhuemer · July 5, 2021, 8:46am

Hello Jeff,

I am currently working with linked monthly CPS data and am using the variable PANLWT for my analysis in STATA.

First of all, I was wondering whether there is already documentation for values of zero (0) for the variables PANLWT and LNKFW1MWT?
When I conduct analysis for adjacent months over a certain period of time (e.g., summarizing the fraction of retired people for different months-in-sample between 2010 and 2021) and use PANLWT ([aweight = panlwt]) after a command, I have no information for individuals in month-in-sample 1 and month-in-sample 5. Shouldn’t the value of PANLWT be 1 instead of 0 for those months? If not, how can I obtain information for those two MIS? Can I use WTFINL even if the observations are pooled over a period of several years?

Also, I was wondering in general about the difference between PANLWT and LNKFW1MWT. As far as I have read in this forum, the former “uses time 2 weights”, while the latter “uses time 1 weights”. However, I am not sure how to interpret this information. Could you maybe explain how the weights are constructed and/or how to interpret them (especially the difference between time 1 and time 2 weights)? That would be of great help.

Finally, a general question for the case where I am working with data for adjacent months over a period of several years: Should I always use the variables PANLWT or LNKFW1MWT? When I am, for example, just summarizing the fraction of individuals in college or university for different months-in-sample pooled over a period of several years, could I also use WTFINL?

Thank you!

Clemens

Grace_Cooper · July 13, 2021, 9:48pm

A weight of zero occurs if a record is out of universe. Both PANLWT and LNKFW1MWT are non-zero only for records who have data in two consecutive months. In addition, PANLWT is non-zero for records who are able to be linked backward one month (e.g. people who were interviewed in January who were also interviewed in December), whereas LNKFW1MWT is non-zero for records who are able to be linked forward one month (e.g. people who were interviewed in January who were also interviewed in February). The CPS 4-8-4 Rotation Pattern limits who can be linked using either weight according to their month-in-sample (MIS). For instance, a person interviewed in January who is in MIS 1 will have a value of zero for PANLWT because they were not interviewed in December; they will, however, have a non-zero value for LNKFW1MWT because they were interviewed in February (assuming they were not skipped or missed that month). So, for some individuals PANLWT is available and LNKFW1MWT is not available and for other individuals the reverse is true.
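As a rough illustration of the rotation-pattern logic above, here is a minimal Python sketch that reports whether a given month-in-sample can, by design, link one calendar month forward or backward. The function name `one_month_links` and the two constant sets are hypothetical names made up for this example; they are not IPUMS variables. The sketch only encodes survey design, so a person in an eligible MISH can still have a zero weight if they were skipped or missed in the adjacent month.

```python
# Under the 4-8-4 rotation, a household is interviewed for 4 consecutive
# months (MISH 1-4), is out of the sample for 8 months, then returns for
# 4 more consecutive months (MISH 5-8).

# MISH values with an interview scheduled in the NEXT calendar month
# (candidates for a non-zero LNKFW1MWT).
FORWARD_ELIGIBLE = frozenset({1, 2, 3, 5, 6, 7})

# MISH values with an interview scheduled in the PREVIOUS calendar month
# (candidates for a non-zero PANLWT).
BACKWARD_ELIGIBLE = frozenset({2, 3, 4, 6, 7, 8})


def one_month_links(mish):
    """Return design-based one-month link eligibility for a MISH value (1-8)."""
    if mish not in range(1, 9):
        raise ValueError("MISH must be an integer from 1 to 8")
    return {
        "forward": mish in FORWARD_ELIGIBLE,
        "backward": mish in BACKWARD_ELIGIBLE,
    }
```

For example, `one_month_links(1)` reports forward eligibility only, which matches the January MIS 1 case described above, and `one_month_links(8)` reports backward eligibility only.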

The appropriate weight to use depends on the design of the research (e.g. whether the analysis involves pooling multiple basic monthly samples or linking samples across individual months). Analyses in which you are pooling basic monthly samples (e.g. aggregating samples from multiple months or years in order to increase sample size in a cross-sectional dataset) should utilize WTFINL in order to generate population estimates. When pooling multiple basic monthly samples, you will want to be aware of changes in the universe across samples and make sure to divide the weights by the number of samples being pooled. Analyses that require linking the CPS (e.g. tracking individual-level transitions across months in order to create a panel dataset) should utilize PANLWT or an IPUMS-constructed longitudinal weight, depending on the research design.
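To make the pooling adjustment concrete, here is a small Python sketch that divides WTFINL by the number of pooled samples before computing a weighted share and a population total. The function name `pooled_weighted_share` and the `(wtfinl, flag)` input layout are assumptions made for this example, and the numbers in the usage note are invented.

```python
def pooled_weighted_share(records, n_samples):
    """Weighted share and average population total for pooled monthly samples.

    records   -- iterable of (wtfinl, flag) pairs pooled across samples,
                 where flag is True if the person has the characteristic
    n_samples -- number of basic monthly samples that were pooled
    """
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    total = 0.0
    flagged = 0.0
    for wtfinl, flag in records:
        weight = wtfinl / n_samples  # rescale so totals reflect one average month
        total += weight
        if flag:
            flagged += weight
    if total == 0:
        raise ValueError("no positive weight in the pooled records")
    return flagged / total, total
```

With two pooled months and records `[(1000, True), (3000, False), (2000, True), (2000, False)]`, the call `pooled_weighted_share(records, 2)` gives a share of 0.375 and a total of 4000.0. Dividing every weight by the same constant leaves the share unchanged; it matters for population totals, which would otherwise be counted once per pooled sample.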

In hopes of clearing up some confusion about these weights, I will describe how they differ, which is what causes the zero values you mentioned. PANLWT (created by the Bureau of Labor Statistics) weights persons based on the population at time 2, whereas LNKFW1MWT (created by IPUMS) weights observations based on the target population at time 1. Analyses of individuals interviewed in February linked to their January interview for a backward-looking question (e.g., what fraction of people in February transitioned from unemployment in the previous month?) should use PANLWT, because the target population is the February (time 2) population. In contrast, analyses of individuals interviewed in January who are linked to their February interview for a forward-looking question (e.g., what fraction of people who were unemployed in January became employed in February?) focus on a target population at time 1 (January) and should be weighted with LNKFW1MWT.
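The forward-looking case can be sketched in Python as follows. This is only an illustration under stated assumptions: each linked record is a dict with hypothetical keys `status_t1`, `status_t2`, and `lnkfw1mwt`, and the status labels are plain strings rather than actual IPUMS EMPSTAT codes. Records with a zero or missing weight are skipped, since they contribute nothing to a weighted estimate.

```python
def forward_transition_rate(linked, origin, destination):
    """Share of the time-1 `origin` population found in `destination` at time 2.

    linked -- iterable of dicts with keys 'status_t1', 'status_t2', and
              'lnkfw1mwt' (the time-1 record's forward-link weight;
              None stands in for a missing code)
    """
    at_risk = 0.0
    moved = 0.0
    for row in linked:
        weight = row["lnkfw1mwt"]
        if not weight:  # skip missing (None) and zero weights
            continue
        if row["status_t1"] != origin:
            continue
        at_risk += weight
        if row["status_t2"] == destination:
            moved += weight
    if at_risk == 0:
        raise ValueError("no weighted records in the origin state")
    return moved / at_risk
```

For instance, with weights of 300 for an unemployed-to-employed record and 100 for an unemployed-to-unemployed record, the rate from "unemployed" to "employed" is 300 / 400 = 0.75. A backward-looking version would condition on the time-2 status and use PANLWT from the time-2 record instead.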

For more information on longitudinal weights, see the PowerPoint presentations and lab exercises from the 2018 IPUMS CPS Linking Workshop and this page on Linking and the CPS.

Devon_Yee · August 9, 2021, 11:52pm

Hi Grace,

I’m constructing an 8-month panel (January, February, March, and April of 2020 and 2021) using CPS data. I matched on age, race, gender, and CPSIDP to link individuals across this time period. Would I use LNKFW8WT to weight my estimates? These weights are only available for January and February 2020 in the data I downloaded. Is it appropriate to apply the weights from January to the other months? I’ve been looking at the resources from this forum post, but wasn’t able to find specific guidance for longitudinal link weights for 8-month panels.

Thanks!
Devon

Grace_Cooper · August 17, 2021, 2:14am

LNKFW8WT is the correct longitudinal weight to use to link a full 8-month CPS panel; it takes the cohort of people who began in the CPS in the first month of your panel (in your case people who entered the CPS in January 2020 and continued through April 2020 and then in January through April 2021) and adjusts their January weight to account for attrition. Therefore, the only weight you need to apply for analyses using your 8-month panel linked dataset is LNKFW8WT for the first month of your panel (i.e. January 2020). For more information about linking the CPS, see this page Linking and the CPS as well as these materials from the 2018 IPUMS CPS Linking Workshop.
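One way to carry the first-month weight across a linked panel is sketched below in Python. It assumes a long-format extract in which each row is a dict with keys `cpsidp`, `year`, `month`, and `lnkfw8wt`; the function name `attach_first_month_weight` and the output key `panel_wt` are hypothetical names chosen for this example. People without a positive LNKFW8WT in the first panel month are dropped, on the assumption that they are not part of the fully linked cohort.

```python
def attach_first_month_weight(rows, first_year, first_month):
    """Copy each person's first-month LNKFW8WT onto all of their panel rows.

    rows -- list of dicts with keys 'cpsidp', 'year', 'month', 'lnkfw8wt'
            (None stands in for a missing code)
    Returns new dicts with an added 'panel_wt' key; input rows are not modified.
    """
    # Pass 1: collect the positive first-month weight for each person.
    base_weight = {}
    for row in rows:
        in_first_month = (row["year"], row["month"]) == (first_year, first_month)
        if in_first_month and row["lnkfw8wt"]:
            base_weight[row["cpsidp"]] = row["lnkfw8wt"]

    # Pass 2: keep only people with a base weight and attach it to every row.
    weighted = []
    for row in rows:
        person = row["cpsidp"]
        if person in base_weight:
            weighted.append({**row, "panel_wt": base_weight[person]})
    return weighted
```

For the panel described above, this would be called as `attach_first_month_weight(rows, 2020, 1)`, and `panel_wt` would then serve as the weight in every month of the panel.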

Clemens_Oberhuemer · October 4, 2021, 2:10pm

Thank you very much, Grace!
