Reweighting after imputation


I am using a very basic method to impute employment status for months in which respondents are missing from the sample, as done in the paper cited below. I understand that the CPS weights account for nonresponse. However, once these responses are imputed, the weights might overcorrect for nonresponse. The authors suggest a method for recalculating the final weight WTFINL. I am interested in calculating labor force transition rates using the imputed data. I am not interested in calculating flows that add up to the official BLS stocks.
The following questions arise:

  • Are there any procedures you know of to reweight the data after imputation?
  • Are there resources to recalculate PANLWT after the imputation?

Any advice would be highly appreciated.

Literature: Bernhardt, Munro & Wolcott (2019): How Does the Dramatic Rise of CPS Nonresponse Impact Labor Market Indicators?

The general approach to reweighting in cases like this is to adjust the weights so that they sum to a predetermined population total. This is how the Census Bureau adjusts the CPS base weights to calculate the final weight (you can read the details of the CPS weight construction in CPS Technical Paper 77). This is done separately for several combinations of demographic groups, including state, age, race, and Hispanic ethnicity.

The main difference I see with your application is that, while the weights are usually adjusted upward to compensate for nonresponse, in your case they will mostly be adjusted downward. This essentially undoes the nonresponse adjustment that was previously done (since you now have imputed data for a larger number of people).
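To make the downward adjustment concrete, here is a minimal Python sketch of a within-cell ratio adjustment. The records, the cell labels, and the control totals are made up for illustration, and `poststratify` is a hypothetical helper, not part of any CPS or IPUMS tooling. It only shows the arithmetic of scaling weights to a control total, not the Census Bureau's full procedure.

```python
from collections import defaultdict

def poststratify(records, controls, cell_key="cell", weight_key="weight"):
    """Scale weights so that they sum to the control total within each cell."""
    # Sum the current weights in each cell.
    sums = defaultdict(float)
    for r in records:
        sums[r[cell_key]] += r[weight_key]
    # Multiply every weight by (control total / current sum) for its cell.
    adjusted = []
    for r in records:
        factor = controls[r[cell_key]] / sums[r[cell_key]]
        adjusted.append({**r, "adj_weight": r[weight_key] * factor})
    return adjusted

# Hypothetical data: observed and imputed person-months in two cells.
records = [
    {"id": 1, "cell": "men", "weight": 1500.0, "imputed": False},
    {"id": 2, "cell": "men", "weight": 1500.0, "imputed": True},
    {"id": 3, "cell": "women", "weight": 2000.0, "imputed": False},
    {"id": 4, "cell": "women", "weight": 1000.0, "imputed": True},
]
# Hypothetical population controls (assumed, not real CPS totals).
controls = {"men": 2400.0, "women": 2400.0}

adjusted = poststratify(records, controls)
for r in adjusted:
    print(r["id"], r["cell"], r["adj_weight"])
```

Because the imputed records push the weight sums above the controls in this toy example, every adjustment factor is below one, which is the downward adjustment described above.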

IPUMS CPS has developed a method for creating longitudinal weights using iterative raking, which simultaneously ensures the weights sum to the correct total within several combinations of demographic variables. Details on the method, including Stata code, are available at the IPUMS CPS Linking Page. While this method is used to calculate forward-looking longitudinal weights that account for attrition in later survey waves, you should be able to adapt the code to produce an adjusted version of PANLWT that accounts for the inflated sums of weights from your imputation.
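For readers unfamiliar with raking, the following is a minimal Python sketch of the generic technique (iterative proportional fitting over two margins). It is not the IPUMS Stata code; the function `rake`, the variable names, the records, and the control totals are all assumptions chosen for illustration.

```python
def rake(records, margins, weight_key="weight", max_iter=100, tol=1e-10):
    """Iteratively scale weights to match control totals on each margin.

    margins maps a variable name to {category: control_total}.
    Returns the raked weights in the same order as records.
    """
    weights = [r[weight_key] for r in records]
    for _ in range(max_iter):
        max_change = 0.0
        for var, targets in margins.items():
            # Current weighted total in each category of this variable.
            sums = {cat: 0.0 for cat in targets}
            for r, w in zip(records, weights):
                sums[r[var]] += w
            # Scale each weight so this margin matches its controls.
            for i, r in enumerate(records):
                factor = targets[r[var]] / sums[r[var]]
                weights[i] *= factor
                max_change = max(max_change, abs(factor - 1.0))
        # Stop once a full pass leaves all margins essentially unchanged.
        if max_change < tol:
            break
    return weights

# Hypothetical records with starting weights.
records = [
    {"sex": "men", "age": "young", "weight": 30.0},
    {"sex": "men", "age": "old", "weight": 40.0},
    {"sex": "women", "age": "young", "weight": 50.0},
    {"sex": "women", "age": "old", "weight": 60.0},
]
# Hypothetical control totals; both margins sum to the same grand total (220).
margins = {
    "sex": {"men": 100.0, "women": 120.0},
    "age": {"young": 90.0, "old": 130.0},
}

raked = rake(records, margins)
print(raked)
```

Adjusting one margin disturbs the other, which is why the loop repeats until the factors settle near one. The control totals for the different margins need to be mutually consistent (here both add up to 220) for the procedure to converge.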

I skimmed the paper you cited, and I see that they impute employment status for months in which a household is missing from the survey. I don’t see any discussion of how those households are assigned a weight (they don’t have a weight for missing months in the CPS microdata). Before adjusting the weights as I described above, you’ll need to decide how to assign a weight to those missing months (for example, by averaging the household’s weights over the months that it was in the sample).
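As one possible reading of the averaging idea, here is a small Python sketch that fills a missing month with the mean of the weights observed in the other months. The helper `fill_missing_weights` and the numbers are hypothetical, and this is only one of several defensible choices, not something taken from the paper.

```python
def fill_missing_weights(monthly_weights):
    """Replace None entries with the mean of the observed weights."""
    observed = [w for w in monthly_weights if w is not None]
    if not observed:
        raise ValueError("no observed weight to average")
    mean_w = sum(observed) / len(observed)
    return [mean_w if w is None else w for w in monthly_weights]

# Hypothetical weights for four months in sample; month 2 is missing.
weights = [1500.0, None, 1700.0, 1600.0]
filled = fill_missing_weights(weights)
print(filled)
```

The filled weights would still need the population-total adjustment afterward, since adding weighted records inflates the weight sums.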

Thank you very much for your detailed answer and all the effort.

I will dive into the material you provided me with.

I am sorry to resurrect this thread. I am struggling to replicate PANLWT, which I understand to be another version of PWLGWGT, which is included in the original CPS files.
As I read here on the forum, PANLWT uses time 2 population controls and weights. So I tried using the final sampling weights WTFINL from period 2 rather than from period 1. I only considered individuals from period 1 who can be linked to period 2; among period 2 individuals, I tried both using all of them and using only those who can be linked back (i.e., the same IDs as in period 1).
Since the groups used in the raking for LNKFW1MWT follow the CPS guideline for the second-stage ratio adjustment (TP 66, p. 10-7), I did not change them. Besides, I would not have known what to use otherwise.
I have difficulty finding information on PANLWT. I understand the difference between this weight and LNKFW1MWT, and I understand that PANLWT produces less inflated gross flows. However, after reading TP 66, TP 77, and other material provided by BLS, I still have no idea exactly how PANLWT is created.
My first, intuitive attempt bore no fruit.

I would very much appreciate any hint or push in the right direction.

The construction of PANLWT is detailed in CPS Technical Paper 66, on pages 10-14 through 10-15 (pp. 85-86 of the PDF), where these weights are referred to as “longitudinal weights.” It is a pretty simple adjustment: PANLWT “inflates all estimates or final weights by the ratio of the current month’s population controls to the sum of the second-stage weights for the current month in the matched cases by sex”.

What this means is that the weights in the subsample that matches between month t-1 and month t are inflated so that they sum to the population total (of civilians age 16+) in month t, separately for men and women.
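Written as a formula, my reading of that adjustment is roughly the following. The notation is my own, not TP 66's: w_{i,t} is person i's second-stage weight in month t, s(i) is their sex, C_{s,t} is the month-t population control for sex s, and M_t is the set of cases matched between months t-1 and t.

```latex
\[
\mathrm{PANLWT}_i
  \;=\; w_{i,t} \times
  \frac{C_{s(i),t}}{\sum_{j \in M_t,\; s(j) = s(i)} w_{j,t}}
\]
```

Since the matched cases are only a subset of the month-t sample, the denominator falls short of the control total, so the ratio is above one and the weights are inflated.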

As a verification, I made the following tables in Stata, looking at the relationship between PANLWT and WTFINL in May 2018. As you can see, the sum of PANLWT for men is the same (within a tiny margin of error) as the sum of WTFINL for those men who were in-universe for PANLWT.


Thank you, Matthew, thank you very much.
I have read this passage over and over again, yet I have no explanation for why I failed to understand it. I guess I need to try harder to improve my English…
The way you phrased it was immediately clear to me.
Once again, thank you very much, and I am sorry to have bothered you with this silly question.

Not silly at all! I’m glad I could help.