Hello, I have a question about the weight variables in CPS. I’d appreciate any help.

My main question is how to use year-specific weights when I pool multiple years of CPS data, such as 1977-2009. In Stata, when I run a regression, I simply add the person or household weight without any adjustment. For example, I use “regress hhincome age race i.state [aw = hwtsupp]” to apply the weight.

But I also see that the sums of household weights differ substantially from year to year. For example, the sum for 2009 is 1.173e+08, for 2001 is 1.083e+08, for 1995 is 0.99e+08, and so on. Is it still reasonable to use year-specific weights when I run a regression on multi-year CPS data? Should I create a new weight variable?

Many thanks.

IPUMS CPS does not currently recommend a specific re-weighting process when pooling data across samples. One possible, and likely imperfect, method is to divide your sample weight by the number of samples you are combining. So, for example, if you are pooling 10 samples together you’d divide the sample weight by 10 in your pooled sample.
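The divide-by-number-of-samples adjustment described above can be sketched as follows. This is only an illustration in Python on made-up records, not an IPUMS-recommended procedure; the function name `pooled_weights`, the output field `pooled_wt`, and the toy values are assumptions, while `hwtsupp` is the household weight variable from the question.

```python
def pooled_weights(records, weight_key="hwtsupp", year_key="year"):
    """Divide each record's weight by the number of distinct samples (years) pooled."""
    n_samples = len({r[year_key] for r in records})
    # Copy each record and attach the rescaled weight under a hypothetical name.
    return [{**r, "pooled_wt": r[weight_key] / n_samples} for r in records]


# Toy records (invented values): three annual samples pooled together.
records = [
    {"year": 1995, "hwtsupp": 1500.0},
    {"year": 1995, "hwtsupp": 2500.0},
    {"year": 2001, "hwtsupp": 1800.0},
    {"year": 2009, "hwtsupp": 2100.0},
]

pooled = pooled_weights(records)
for row in pooled:
    print(row["year"], row["hwtsupp"], row["pooled_wt"])
```

Dividing every weight by the same constant changes weighted totals, so the pooled sum approximates an average-year population rather than the sum over all years. It leaves weighted means and regression coefficients unchanged, since those depend only on relative weights.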

Personally, I’d think about what is gained from pooling annual samples together. There are many reasons why doing so is helpful (e.g., to achieve a sufficient sample size), but understand that these potential benefits come at a cost (e.g., sacrificing precision of weighted estimates).

I hope this helps.

The sum of weights is the estimate of the population size. So in 1995, we had an estimated 99M households, in 2001, we had an estimated 108M households, and in 2009, 117M households. The country's population is growing, so I am not sure what else to expect of that sum of weights other than that it would be going up.
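To see this in your own extract, you can total the household weight within each year. A minimal sketch in Python on invented records follows; the function name `weight_totals_by_year` and the toy values are assumptions, and real CPS totals would be on the order of 1e+08, as in the question.

```python
from collections import defaultdict


def weight_totals_by_year(records, weight_key="hwtsupp", year_key="year"):
    """Sum the weight within each year; each total estimates that year's population."""
    totals = defaultdict(float)
    for r in records:
        totals[r[year_key]] += r[weight_key]
    return dict(totals)


# Toy household records (invented values).
households = [
    {"year": 1995, "hwtsupp": 1500.0},
    {"year": 1995, "hwtsupp": 2500.0},
    {"year": 2001, "hwtsupp": 1800.0},
    {"year": 2001, "hwtsupp": 2600.0},
]

totals = weight_totals_by_year(households)
for year in sorted(totals):
    print(year, totals[year])
```

A rising total across years reflects growth in the estimated number of households, not a problem with the weights.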

Multiple-year weights are tricky, let’s just say that. See discussions elsewhere on this forum (what weights to use if I merge multiple years of ACS data?), as well as on the ACS Data Users Community (https://acsdatacommunity.prb.org/sear…), where you can find a couple of useful webinars.