Hi all,

I am interested in calculating the average wage of people who were absent from work in the previous week. Ideally, I would like to get this average for every month from 2017 to 2019. My plan is to do something like this (SAS code):

proc tabulate data=mydata;
 where absent=3 and earnwt > 0;
 var earnweek;
 class year month;
 weight earnwt;
 table year*month, earnweek*(mean) / nocellmerge;
run;

I have a couple of questions:

  1. Is this the correct way to limit to only people in the outgoing rotation group (checking EARNWT > 0) or is there something else I need to do?

  2. I have seen people talk about needing to divide EARNWT. Do I need to do that here? What if I decided to pool the data for each year, would I then need to divide EARNWT by 12?

Thank you so much!

In response to your questions:

  1. EARNWT>0 is not a good way of identifying individuals with earnings data. In fact, all civilians age 16+ in outgoing rotation groups have EARNWT>0, but only those who are currently employed (and not self-employed) have data from the earner study. The variable ELIGORG should be used instead to identify those with earnings data. Note that this variable does not always line up perfectly with the stated universe for the earner study, but in 2017-2019 there are very few cases of mismatch.

  2. The only reason you would need to divide the weights by the number of pooled samples is if you wanted to estimate population totals (e.g. the number of employed people who were in a union in a given year). For calculating averages or proportions, you don’t need to divide the weights.
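Putting the two answers together, here is a rough sketch of how the code might look. It assumes the extract (mydata, as in the question) also contains ELIGORG, and that ELIGORG = 1 flags earner-study eligibility; the dataset name pooled and the variable earnwt_annual are made up for illustration. Dividing every weight by the same constant does not change a weighted mean, since the constant cancels between the numerator and the denominator, which is why only the totals step rescales EARNWT.

```sas
/* Monthly averages: restrict with ELIGORG rather than EARNWT > 0. */
proc tabulate data=mydata;
  where eligorg = 1 and absent = 3 and year in (2017, 2018, 2019);
  class year month;
  var earnweek;
  weight earnwt;
  table year*month, earnweek*mean / nocellmerge;
run;

/* Annual averages from pooled monthly samples: weights used as-is. */
proc means data=mydata mean;
  where eligorg = 1 and absent = 3 and year in (2017, 2018, 2019);
  class year;
  var earnweek;
  weight earnwt;
run;

/* Annual population totals: divide the weight by the 12 pooled months. */
data pooled;                        /* hypothetical dataset name */
  set mydata;
  where eligorg = 1 and absent = 3 and year in (2017, 2018, 2019);
  earnwt_annual = earnwt / 12;      /* hypothetical variable name */
run;

/* SUMWGT reports the sum of the rescaled weights within each year. */
proc means data=pooled sumwgt;
  class year;
  var earnweek;
  weight earnwt_annual;
run;
```

The first two steps should give the same averages whether or not EARNWT is divided by 12; only the SUMWGT figures in the last step depend on the rescaling.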

Thanks so much!

And just so I make sure I understand, ELIGORG = 1 is the same as (EMPSTAT in (10,11,12) and CLASSWKR in (20,21,22,23,24,25,26,27,28) and AGE >= 15 and MISH in (4,8)). Is that right?

In theory it should be, but they are not quite identical. That’s what I meant by "this variable does not always line up perfectly with the stated universe for the earner study." In each sample there are a handful of cases that do not follow the pattern, and we don’t know for sure why. It seems that these people were erroneously asked the earner study questions. This interpretation is supported by the fact that these individuals have EARNWT=0.

One more note: in the ASEC samples (only) the universe includes 15 year olds. For all other samples (referred to as basic monthly samples or BMS), the universe excludes 15 year olds.
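If it helps, the comparison discussed above can be checked directly in an extract. This is only a sketch under a few assumptions: mydata contains basic monthly samples only (hence AGE >= 16 rather than 15), the code lists are copied as written in the question, and the names check, in_universe, and mismatch are made up for illustration.

```sas
data check;                          /* hypothetical dataset name */
  set mydata;
  /* A SAS comparison evaluates to 1 (true) or 0 (false). */
  in_universe = (empstat in (10, 11, 12)
                 and classwkr in (20, 21, 22, 23, 24, 25, 26, 27, 28)
                 and age >= 16
                 and mish in (4, 8));
  /* Flag cases where ELIGORG and the constructed universe disagree. */
  mismatch = (in_universe ne (eligorg = 1));
run;

/* Cross-tabulate the constructed flag against ELIGORG (unweighted counts). */
proc freq data=check;
  tables in_universe*eligorg / missing norow nocol nopercent;
run;

/* Inspect the earner weights of the mismatched cases. */
proc means data=check n min max;
  where mismatch = 1;
  var earnwt;
run;
```

The off-diagonal cells of the PROC FREQ table are the mismatches, and the PROC MEANS output shows the range of EARNWT among them.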