I am interested in estimating how much time people in different occupations spend on R&D. One approach is to use the surveys to compute the share of employees in each occupation who allocate the majority of their time to R&D. However, I am unsure how to incorporate the weights when computing those shares. For this purpose, should I use the weights? Which variable? How? Should they be treated as frequency weights? Happy to provide additional context if it helps.
Which weight to use depends on whether you are analyzing SESTAT samples only (in which case you should use WEIGHT) or also including non-SESTAT samples (which should use WTSURVY). These should be treated as sampling weights (aka inverse probability weights).
I recommend reading this blog post for an overview of sample weighting. Any statistical package (such as Stata or SPSS) will allow you to use weights in most statistical procedures.
Here are some resources on weighting in the most common statistical packages:
R: survey and srvyr packages
Stata: add [pw=weightvar] to your command, after the variable list and any if/in qualifiers and before the comma that starts the options. See the Stata User’s Guide (sections 11.1.6 and 20.24)
SAS: most procedures support a WEIGHT statement
SPSS: weighting in SPSS
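As a rough sketch of what a weighted proportion looks like under the hood, here is the calculation in plain Python: within each occupation, the weighted share is the sum of the weights of respondents flagged as R&D-primary divided by the sum of all weights. The field names `occupation`, `rd_primary`, and `weight` and the records themselves are made up for illustration; in practice the weight column would be WEIGHT or WTSURVY as described above, and a survey package would also handle the standard errors, which this sketch does not attempt.

```python
from collections import defaultdict


def weighted_share_by_group(rows, group_key, flag_key, weight_key):
    """Return {group: sum(w * flag) / sum(w)} for each group in rows."""
    numerator = defaultdict(float)    # weighted count of flagged respondents
    denominator = defaultdict(float)  # weighted count of all respondents
    for row in rows:
        w = row[weight_key]
        numerator[row[group_key]] += w * row[flag_key]
        denominator[row[group_key]] += w
    # Skip groups whose weights sum to zero to avoid dividing by zero.
    return {g: numerator[g] / denominator[g] for g in denominator if denominator[g] > 0}


# Hypothetical records: rd_primary is 1 if R&D is the respondent's primary
# activity, and weight stands in for the survey's sampling weight.
respondents = [
    {"occupation": "economist", "rd_primary": 1, "weight": 120.0},
    {"occupation": "economist", "rd_primary": 0, "weight": 40.0},
    {"occupation": "economist", "rd_primary": 1, "weight": 40.0},
    {"occupation": "accountant", "rd_primary": 0, "weight": 300.0},
    {"occupation": "accountant", "rd_primary": 1, "weight": 100.0},
]

shares = weighted_share_by_group(respondents, "occupation", "rd_primary", "weight")
for occupation, share in sorted(shares.items()):
    print(f"{occupation}: {share:.2f}")
```

In this toy data the weighted shares (0.80 for economists, 0.25 for accountants) differ from the unweighted ones (2 of 3 and 1 of 2), which is the reason to weight. The point estimate comes out the same whether the weights are labeled frequency or sampling weights; the distinction matters for the standard errors.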
Thanks @Matthew_Bombyk. In my case, I am computing, for each occupation, the share of respondents whose primary role is R&D. Would the weights still be appropriate when the statistic is computed within each occupation rather than across occupations? I eventually pool the samples across surveys and years, still by occupation. I am not looking for a precise estimate, just one that is roughly correct. For example, using the unweighted method described previously, the occupations where a majority (at least 50%) reported R&D as their primary role were those with “scientist” in their names, plus economists. Those mapped to about 41 occupations in the BLS OEWS occupations, which was very close to some manually constructed lists. I would appreciate any feedback on the appropriateness of the strategy, or recommendations on how and when to incorporate the weights.