Weighting for State-Industry Level Estimates

Hi All,

I would like to construct state-industry level measures of employment, similar to what Paul Goldsmith-Pinkham, Isaac Sorkin and Henry Swift do in their replication code below:

collapse (sum) indwt = perwt_cz (firstnm) `geo' ind`ind_digits' if age>=18 & full_time==1, by(`geo'_ind`ind_digits' year)

What I am unsure about is what the weight should look like. They construct it as the product of the standard person-level weight and “afactor”, which measures the share of a given PUMA's population that falls in a commuting zone, the geographic unit of interest in their analysis.

rename afactor cz_wt
label var cz_wt "Commuting Zone Weight"
gen perwt_cz = perwt * cz_wt
label var perwt_cz "Person * Commuting Zone Weight"

Will I need to construct a comparable “State Weight”?

Thanks for your help!

The problem these researchers appear to be solving is that the data do not report which commuting zone a sampled household resides in, and that the smallest identified geographic unit, the public use microdata area (PUMA), does not align with commuting zone boundaries. The code you shared therefore appears to allocate each PUMA's population across the commuting zones it overlaps, using afactor as the allocation share. This is not an issue when constructing state-level measures, since PUMAs do not cross state boundaries, so no comparable state weight should be needed and PERWT can be used on its own. The state in which a respondent's household resides is identified in the variable STATEFIP.
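
As a rough illustration of the state-level version, here is a minimal sketch on made-up toy data. The industry variable name ind and all values are assumptions for illustration only; full_time is taken to be the indicator constructed in the code you quoted, and perwt and statefip stand in for PERWT and STATEFIP in your extract.

```stata
* Toy data standing in for an IPUMS extract (all values are made up).
clear
input byte statefip int ind int year byte age byte full_time double perwt
 6 770 2000 34 1 120
 6 770 2000 51 1  80
 6 770 2000 17 1  95
 6 641 2000 42 1 100
36 770 2000 29 0 110
36 770 2000 45 1  60
end

* State-by-industry employment: sum the person weights directly.
* No geographic allocation factor is applied, because each person
* belongs to exactly one state.
collapse (sum) indwt = perwt if age>=18 & full_time==1, by(statefip ind year)

list, sepby(statefip)
```

On this toy data the under-18 and non-full-time rows are dropped, leaving weighted counts of 100 and 200 for the two industries in state 6 and 60 for state 36. With real data you would replace the input block with your own extract and swap in whichever industry variable you are using.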