What is the difference between a 0 and a missing code for the longitudinal weights? Does missing indicate not able to link (an unlinkable sample) and a 0 indicate that a link is theoretically possible but no link is made?

When someone in MISH 8 receives a code of 0 for lnkfw1mwt, does this indicate that links between that MONTH and YEAR and the MONTH/YEAR of the subsequent MISH (+1 month or +8 months depending on MIS) are possible, even though this particular respondent isn’t eligible to link?


The difference between a longitudinal weight value of zero and a missing code is largely conceptual. In practice, observations with either code are effectively omitted from weighted analysis. As discussed in the WTFINL variable description, the raw sampling weights in the basic monthly samples are “based on the inverse probability of selection into the sample and adjustments for the following factors: failure to obtain an interview; sampling within large sample units; adjustments to the known distribution of the entire population according to stage, age, sex, race, and Hispanic ethnicity; and allotting a weight of zero to populations not sampled in other monthly surveys (i.e., persons in the Hispanic oversample and members of the armed forces in ASEC samples).” Observations with missing codes, on the other hand, cannot be linked for a variety of reasons, including: (i) the individual should have responded to the questionnaire in the subsequent month or (ii) the individual is in an incoming or outgoing rotation group (e.g., MISH==1,4,5, or 8).
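To illustrate the point that zero and missing weights both drop out of a weighted estimate, here is a minimal sketch in plain Python. The records, the `employed` field, and the `weighted_mean` helper are invented for illustration; `None` stands in for the missing code (.), and real extracts may encode missing values differently depending on the software used.

```python
import math

# Toy records; all values are invented for illustration only.
# None stands in for the missing code (.), 0.0 for a zero weight.
records = [
    {"employed": 1, "LNKFW1MWT": 1500.0},
    {"employed": 0, "LNKFW1MWT": 2500.0},
    {"employed": 1, "LNKFW1MWT": 0.0},   # zero weight
    {"employed": 1, "LNKFW1MWT": None},  # missing weight
]


def weighted_mean(rows, value_key, weight_key):
    """Weighted mean that skips rows whose weight is missing."""
    numerator = 0.0
    denominator = 0.0
    for row in rows:
        weight = row[weight_key]
        if weight is None or math.isnan(weight):
            continue  # missing weight: the row is dropped entirely
        numerator += weight * row[value_key]  # zero weight: adds nothing
        denominator += weight
    return numerator / denominator


# Estimate using every row, letting the weights do the filtering.
all_rows = weighted_mean(records, "employed", "LNKFW1MWT")

# Estimate after explicitly keeping only rows with a positive weight.
positive_only = weighted_mean(
    [r for r in records if r["LNKFW1MWT"]], "employed", "LNKFW1MWT"
)
```

Both estimates come out the same, which is why the distinction between the two codes matters for interpretation rather than for the weighted result itself.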

Hi Jeff,

Thanks for your reply. I’m talking specifically about the longitudinally linking weights, e.g. LNKFW1MWT.

The documentation on IPUMS defines the universe for this variable as “Those individuals who are eligible to link to the next month.” The vast majority (97% from 1976-present) of people who are in MIS 8 (and therefore not eligible to link to the next month due to survey design) receive a code of zero (0), rather than missing (.). If the universe statement were correct, all of these individuals would have received a missing code (.). Further, the documentation doesn’t define values of zero (0).

The remaining 3% receive a missing code. The missing codes, from looking at the data, appear to be concentrated in “unlinkable samples”, which is consistent with the statement “Users should note that availability of these variables in a given sample is contingent on the availability of the samples required to make the given type of link” from the FAQs on weighting linked samples.

Another area where the documentation could be clarified is what is meant by adjacent months (specifically calendar MONTH vs MISH). My first assumption for interpreting LNKFW1MWT was that for someone in MISH 4, this would be the weight for linking to MISH 5. It clearly isn’t, as everyone in MISH 4 receives a value of 0 or ., just as for MISH = 8. Instead, it looks like the relevant weight for that comparison is LNKFWMIS45WT, and adding a note in the documentation might be helpful.
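The pattern described here can be checked with a simple cross-tabulation of weight codes by MISH. The following is only a sketch on invented rows: the `code_of` helper and the toy values are hypothetical, and `None` stands in for the missing code (.).

```python
from collections import Counter

# Invented toy rows: (MISH, LNKFW1MWT); None stands in for missing (.).
rows = [
    (3, 1800.0),
    (4, 0.0),
    (4, None),
    (8, 0.0),
    (8, 0.0),
    (8, None),
]


def code_of(weight):
    """Collapse a weight into one of three categories."""
    if weight is None:
        return "missing"
    if weight == 0:
        return "zero"
    return "positive"


# Count how many records fall in each (MISH, category) cell.
counts = Counter((mish, code_of(weight)) for mish, weight in rows)

for (mish, code), n in sorted(counts.items()):
    print(f"MISH={mish} {code}: {n}")
```

On real data, rows for MISH 4 and 8 with no "positive" cell would match the observation above that those rotation groups only ever receive 0 or a missing code.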

I think this could be further clarified by saying “two adjacent calendar months” rather than two adjacent months and being clear in the documentation about how to interpret the weight for those in MISH 4 and 8 (who are not eligible to link to the next calendar month, yet are included in the universe for this variable).

Likewise, clarifying the universe statement for LNKFW1MWT to be specific to calendar month might also help: “Individuals interviewed in a calendar month that is linkable to the next calendar month.”

Also, it would be nice to have clarity on the meaning of zero.

The reason this matters to me is that I was hoping to use these weights to identify linkable samples for analysis, rather than simply to use the weights. So, having missing codes (.) identify unlinkable samples is really helpful to me and I hope you would keep this feature.
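As a rough sketch of that use, the snippet below labels each YEAR/MONTH sample by whether every LNKFW1MWT value in it is missing. It rests on my own reading of the data (missing codes concentrated in unlinkable samples), not on documented behavior, and the rows and the `classify_samples` helper are invented for illustration; `None` stands in for the missing code (.).

```python
from collections import defaultdict

# Invented toy rows: (YEAR, MONTH, MISH, LNKFW1MWT); None = missing (.).
rows = [
    (2019, 1, 3, 1800.0),
    (2019, 1, 8, 0.0),
    (2019, 2, 3, None),
    (2019, 2, 8, None),
]


def classify_samples(rows):
    """Label each (YEAR, MONTH) sample by the weight codes it contains."""
    weights_by_sample = defaultdict(list)
    for year, month, _mish, weight in rows:
        weights_by_sample[(year, month)].append(weight)

    labels = {}
    for sample, weights in weights_by_sample.items():
        if all(w is None for w in weights):
            # Every weight is missing: a candidate unlinkable sample.
            labels[sample] = "all missing"
        else:
            # At least one zero or positive weight is present.
            labels[sample] = "has nonmissing"
    return labels


labels = classify_samples(rows)
```

A sample labeled "all missing" is only a candidate; it would still need to be confirmed against the list of samples that cannot be linked.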

I’m a former IPUMS staffer, so I’m happy to follow up directly with Jose or others on CPS.

And thanks to the IPUMS-CPS staff … I was delighted to learn that these weight variables (hopefully) identify unlinkable samples.


Thanks for these detailed notes. I will send this to the IPUMS CPS Team to use as a reference when they update the documentation for the longitudinal sampling weights. These longitudinal sampling weights were released relatively recently, so this feedback is very valuable.

Regarding identifying linkable samples: The IPUMS CPS Team recently held a Summer Workshop that focused on using the linking capabilities of the CPS data. It seems to me that some of the resources shared for this workshop (which are all available here) will be helpful to you.

Thanks for sharing the additional materials!