Hi,

I am trying to use the linking functionality of IPUMS’ CPS data, and I’m having a bit of trouble determining the correct weight to use. Essentially, I am linking observations for four consecutive months (March 2020-June 2020) in order to follow participants through each of the four months and see how their labor force status changes. To do this, I merged the monthly data on the “cpsidp” variable to generate a wide-form data set for the 1/4 of the sample that is present in all four months (i.e., people with MISH 1 or MISH 5 in March 2020). Each row represents a person, and each variable x appears four times, as x_1, x_2, x_3, and x_4.
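
For concreteness, the linking step amounts to something like the sketch below. It is written in Python/pandas purely for illustration and is not the code actually used here; the `link_months` helper, the `status` column, and the toy rows are all made up, and only the `cpsidp` key and the _1 to _4 naming come from the description above.

```python
from functools import reduce

import pandas as pd


def link_months(monthly_frames):
    """Inner-join monthly extracts on cpsidp (hypothetical helper).

    Every column except cpsidp gets a suffix _1, _2, ... for its month.
    """
    renamed = []
    for i, df in enumerate(monthly_frames, start=1):
        # Keep cpsidp as the join key; suffix every other column with the month index.
        suffixed = {c: f"{c}_{i}" for c in df.columns if c != "cpsidp"}
        renamed.append(df.rename(columns=suffixed))
    # An inner join keeps only people observed in every month.
    return reduce(lambda left, right: left.merge(right, on="cpsidp", how="inner"), renamed)


# Toy extracts (made-up IDs and statuses, not real CPS records).
march = pd.DataFrame({"cpsidp": [1, 2, 3, 4], "status": ["emp", "unemp", "emp", "unemp"]})
april = pd.DataFrame({"cpsidp": [1, 2, 3], "status": ["emp", "emp", "unemp"]})
may = pd.DataFrame({"cpsidp": [1, 2, 5], "status": ["emp", "emp", "emp"]})
june = pd.DataFrame({"cpsidp": [1, 2], "status": ["unemp", "emp"]})

wide = link_months([march, april, may, june])
print(wide)
```

Only the people who appear in all four toy months survive the join, which mirrors how the linked file shrinks to the rotation groups present in every month.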

However, after successfully linking the observations, I am struggling to figure out which weight to use. I am familiar with the “lnkfw1mwt” and “panlwt” options, but two issues arise. First, when I use either of them in a collapse function, I end up with estimates that are about 1/4 of the actual numbers. For example, I get about 550,000 people unemployed on temporary layoff in March, even though the actual number is ~1.8 million. Second, because the wide-form data set has a separate copy of each weight for each month (e.g., lnkfw1mwt_1, lnkfw1mwt_2, and lnkfw1mwt_3), I’m not sure which copy to use.

On the first problem, my hunch is that since these weights (lnkfw1mwt and panlwt) are for linking two consecutive months, they are not large enough to make the sample representative of the population (since my sample is 1/4, not 1/2, as large as a given month). I had an idea to just multiply the weights by 4, but that seemed haphazard so I thought I’d ask the forum instead.
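
The arithmetic behind that hunch can be sketched with toy numbers. The example below is Python/pandas for illustration only: the equal weights of 2,000 and the 100 people per rotation group are invented, and it does not use the real lnkfw1mwt or panlwt values. It only shows that keeping 2 of 8 equally sized rotation groups keeps about 1/4 of the weighted total; it does not show that multiplying a real linking weight by 4 is appropriate.

```python
import pandas as pd

# Toy month: 8 rotation groups (mish 1..8), 100 people each, equal weights.
# These numbers are made up; real CPS weights differ from person to person.
people = pd.DataFrame({
    "mish": [m for m in range(1, 9) for _ in range(100)],
    "weight": [2000.0] * 800,
})

# Weighted total for the whole toy month.
full_total = people["weight"].sum()

# Only MISH 1 and MISH 5 in the first month can be followed for four straight months.
linkable = people[people["mish"].isin([1, 5])]
linked_total = linkable["weight"].sum()

# Share of the weighted total that survives the restriction.
share = linked_total / full_total
print(full_total, linked_total, share)
```

In this toy setup, scaling the linked total by 4 recovers the full total only because every group is the same size and every weight is identical, which is exactly why a blanket multiplier feels haphazard for real data.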

Any help would be great!

Thanks,

Jimmy