# Looking at variance of hours worked for months in CPS sample

I am looking to analyze the variance in the hours people reported working last week (AHRSWORKT) across the 8 months they are included in the CPS sample.

I am thinking of calculating the variance in AHRSWORKT conditional on AHRSWORKT being positive in all 8 of their sample interviews.

I was also thinking of looking at AHRSWORKT/UHRSWORKT.

Does either of these make sense?

I am also wondering a little bit about what weights to use as I don’t quite understand the difference between LNKFW8WT and PANLWT.

I might be misunderstanding what you are trying to do, so let me know if I am mistaken. It sounds like you are looking to calculate summary statistics of hours worked for people across the 8 months they are included in the CPS sample. If so, an easy way to do this is to reshape your data to wide format, so that hours worked last week (AHRSWORKT) in each survey month is on the same line for each individual in your sample. (You can check out some of the available resources on using CPS data as a panel on this page.) Then you can calculate new variables representing the mean and standard deviation (the square root of the variance) of hours worked last week for each individual in your sample. You can then summarize these variables for all individuals in your sample.
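As a rough illustration of the reshape-then-summarize approach, here is a minimal pandas sketch. It assumes a long-format extract with one row per person per interview, identified by CPSIDP (person identifier) and MISH (month in sample, 1 to 8). Those two column names and the toy values are assumptions for the example, so check them against your own extract.

```python
import pandas as pd

# Toy long-format extract: one row per person per month in sample.
# CPSIDP and MISH are assumed column names; the hours are made up.
long_df = pd.DataFrame({
    "CPSIDP": [1] * 8 + [2] * 8,
    "MISH": list(range(1, 9)) * 2,
    "AHRSWORKT": [40, 42, 38, 40, 45, 40, 36, 39,
                  20, 15, 25, 30, 20, 22, 18, 28],
})

# Reshape to wide: one row per person, one column per interview month.
wide = long_df.pivot(index="CPSIDP", columns="MISH", values="AHRSWORKT")
wide.columns = [f"AHRSWORKT_{m}" for m in wide.columns]
hour_cols = [f"AHRSWORKT_{m}" for m in range(1, 9)]

# Per-person mean and sample standard deviation across the 8 interviews.
summary = pd.DataFrame({
    "hrs_mean": wide[hour_cols].mean(axis=1),
    "hrs_sd": wide[hour_cols].std(axis=1, ddof=1),
})

# Summarize the person-level statistics across everyone in the sample.
print(summary.describe())
```

A real extract will also contain not-in-universe codes in AHRSWORKT, which need to be set to missing before computing these statistics.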

Depending on your ultimate goal, conditioning all of this on individuals reporting positive AHRSWORKT values in all interviews seems reasonable, although you should remain cautious of selection bias driven by this conditioning criterion. There’s a lot of month-to-month attrition in CPS, so perhaps run your analysis both with and without this criterion to see how the results change. Additionally, the key difference between AHRSWORKT and UHRSWORKT is that the former specifically asks about hours worked last week, while the latter asks about usual hours worked during an unspecified time frame.
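To make the with-and-without comparison concrete, here is a hedged pandas sketch. It again assumes CPSIDP and MISH as the identifier columns, reads AHRSWORKT/UHRSWORKT as a ratio (one possible reading of the question), and assumes that 999 marks not-in-universe in both variables and 997 marks "hours vary" in UHRSWORKT. Please verify those codes in the codebook for your extract; the data below are made up.

```python
import pandas as pd

# Toy data: person 1 works in all 8 interviews, person 2 has two
# not-in-universe months, person 3 leaves the sample after 6 interviews.
long_df = pd.DataFrame({
    "CPSIDP": [1] * 8 + [2] * 8 + [3] * 6,
    "MISH": list(range(1, 9)) * 2 + list(range(1, 7)),
    "AHRSWORKT": [40, 42, 38, 40, 45, 40, 36, 39,
                  20, 999, 25, 30, 20, 22, 999, 28,
                  35, 35, 40, 30, 35, 35],
    "UHRSWORKT": [40] * 8
                 + [20, 999, 20, 997, 20, 20, 999, 20]
                 + [35] * 6,
})

hours = long_df.copy()
# Assumed special codes: set them to missing before any arithmetic.
hours["actual"] = hours["AHRSWORKT"].where(hours["AHRSWORKT"] < 999)
hours["usual"] = hours["UHRSWORKT"].where(
    (hours["UHRSWORKT"] > 0) & (hours["UHRSWORKT"] < 997)
)
# Actual hours relative to usual hours, missing where either is missing.
hours["ratio"] = hours["actual"] / hours["usual"]

# Flag people with positive actual hours in all 8 interviews.
n_positive = (hours["actual"] > 0).groupby(hours["CPSIDP"]).sum()
always_working = n_positive == 8

# Person-level SD of actual hours, with and without the restriction.
sd_all = hours.groupby("CPSIDP")["actual"].std()
sd_restricted = sd_all[always_working]

print(sd_all.describe())
print(sd_restricted.describe())
```

Comparing the two sets of summaries gives a first sense of how much the restriction changes the picture, since people who attrit or have non-working months drop out of the restricted sample entirely.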

Finally, regarding the sampling weights: from what you’ve described about your calculation, you should be using the LNKFW8WT variable. This variable is designed for calculations that use the full 8-month panel. The PANLWT variable is more specifically designed for linking individuals in two adjacent samples.

Thank you!

The idea is to calculate the standard deviation of hours worked for each individual and use this standard deviation as the dependent variable in a regression with state minimum wage levels as the main independent variable.

Thank you for the clarification on the weights as well.
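For anyone following the thread, the regression described above could be sketched roughly as follows: a weighted least squares fit of each person's hours SD on the state minimum wage, weighted by LNKFW8WT. The column names hrs_sd and state_min_wage are hypothetical (the minimum wage would have to be merged in from an outside source), and the numbers are invented to lie exactly on a line so the result is easy to check. This gives point estimates only and says nothing about standard errors.

```python
import numpy as np
import pandas as pd

# Made-up person-level data, one row per individual.
# hrs_sd and state_min_wage are hypothetical column names.
people = pd.DataFrame({
    "hrs_sd": [6.0, 5.5, 5.0, 4.5, 4.0, 3.5],
    "state_min_wage": [8.0, 9.0, 10.0, 11.0, 12.0, 13.0],
    "LNKFW8WT": [1200.0, 800.0, 1500.0, 950.0, 1100.0, 700.0],
})

y = people["hrs_sd"].to_numpy()
X = np.column_stack([
    np.ones(len(people)),                  # intercept
    people["state_min_wage"].to_numpy(),   # main regressor
])
w = people["LNKFW8WT"].to_numpy()

# Weighted least squares: scale each row by sqrt(weight), then solve
# the ordinary least squares problem on the scaled data.
sqrt_w = np.sqrt(w)
beta, *_ = np.linalg.lstsq(X * sqrt_w[:, None], y * sqrt_w, rcond=None)
intercept, slope = beta

print(f"intercept={intercept:.3f}, slope={slope:.3f}")
```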