When I cross-check my data against the BLS's CPS national self-employment data, my numbers line up pretty closely. The national self-employment figures are on page 12 of 16 at: https://www.bls.gov/opub/mlr/2010/09/art2full.pdf
Unfortunately, my state-level numbers do not line up nearly as well with ACS data, which is the only source I can find at the state level (ACS table K202402 provides yearly unincorporated self-employment averages). My annual average counts for unincorporated self-employed (SE) workers are off by a much greater proportion.
Could this be because the ACS defines self-employment differently than the CPS does? Or is the ACS data perhaps seasonally adjusted, whereas my numbers are not seasonally adjusted (NSA)?
My 12-month pool of data has a sample size of 711, so I gather that it is large enough. Here’s a sample of my R code:
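# Yearly average of the unincorporated self-employment count in GA
library(dplyr)

GASelfEmployment2019_Avg <- GAyear2019 %>%
  # classwkr == 13: self-employed, not incorporated (IPUMS CPS coding)
  filter(empstat >= 10, classwkr == 13,
         month >= 1, month <= 12,
         wkstat >= 11, wkstat <= 41) %>%
  # Sum the final weights over all 12 monthly samples, then divide by 12
  summarize(unincorporatedSEworkers = sum(wtfinl)) %>%
  mutate(SE2019avg = unincorporatedSEworkers / 12)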
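To make the averaging step explicit, here is a minimal sketch that first sums the weights within each month and then averages the monthly totals. It gives the same result as dividing the pooled sum by 12 only when all 12 months are present, so it also reports how many months were found. The `toy` data frame and its values are made up for illustration and are not real CPS records.

```r
library(dplyr)

# Toy data: hypothetical weights, two respondents per month for 12 months
toy <- data.frame(
  month    = rep(1:12, each = 2),
  classwkr = 13,
  wtfinl   = rep(c(1000, 2000), times = 12)
)

# Step 1: weighted count of unincorporated SE workers in each month
monthly <- toy %>%
  filter(classwkr == 13) %>%
  group_by(month) %>%
  summarize(se_count = sum(wtfinl), .groups = "drop")

# Step 2: average the monthly counts and record how many months contributed
annual_avg <- monthly %>%
  summarize(n_months = n(), SE_avg = mean(se_count))

annual_avg
```

If `n_months` comes back as anything other than 12, dividing the pooled sum by 12 would misstate the average.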
Also, if I want a disaggregated self-employment rate by quarter (joining a quarterly self-employment count with a quarterly count of all employed workers), would I be on the right track with this? See the R code below:
# Disaggregated counts of unincorporated self-employed workers for 2017 Q1
library(dplyr)

GASelfEmployment2017_Q1 <- GAyear2017 %>%
  filter(month >= 1, month <= 3,
         empstat >= 10, classwkr == 13,
         wkstat >= 11, wkstat <= 41) %>%
  group_by(new_race = haven::as_factor(new_race),
           sex = haven::as_factor(sex)) %>%
  summarize(unincorporatedSEworkers2017Q1 = sum(wtfinl))

# Disaggregated counts of ALL employed workers for 2017 Q1
GAALLEmployed2017_Q1 <- GAyear2017 %>%
  filter(month >= 1, month <= 3,
         empstat >= 10, wkstat >= 11, wkstat <= 41) %>%
  group_by(new_race = haven::as_factor(new_race),
           sex = haven::as_factor(sex)) %>%
  summarize(AllEmployedWorkers2017Q1 = sum(wtfinl))

# Join the self-employed and all-worker tables on the grouping columns,
# then calculate the self-employment rate for 2017 Q1
SErateStat2017_Q1 <- full_join(GASelfEmployment2017_Q1, GAALLEmployed2017_Q1,
                               by = c("new_race", "sex")) %>%
  mutate(SErate = unincorporatedSEworkers2017Q1 / AllEmployedWorkers2017Q1)
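For comparison, here is a minimal sketch of the same rate computed in a single grouped pass, without building two tables and joining them. It also divides the three-month sums by 3 so the counts read as monthly averages. The rate is unaffected because the factor cancels. The `toy_q1` data frame, its values, and the use of 22 as a stand-in for "any other class of worker" are assumptions made for illustration only.

```r
library(dplyr)

# Toy data: hypothetical Q1 records, already restricted to employed workers
toy_q1 <- data.frame(
  month    = c(1, 1, 2, 2, 3, 3),
  sex      = c("Male", "Female", "Male", "Female", "Male", "Female"),
  classwkr = c(13, 22, 22, 13, 13, 22),
  wtfinl   = c(100, 300, 400, 200, 100, 300)
)

se_rate_q1 <- toy_q1 %>%
  filter(month >= 1, month <= 3) %>%
  group_by(sex) %>%
  summarize(
    # Numerator: weights of unincorporated self-employed workers only
    se_workers  = sum(wtfinl[classwkr == 13]) / 3,
    # Denominator: weights of all employed workers in the group
    all_workers = sum(wtfinl) / 3,
    .groups = "drop"
  ) %>%
  mutate(SErate = se_workers / all_workers)

se_rate_q1
```

Computing both sums inside one `summarize()` keeps the numerator and denominator on exactly the same filtered rows, which avoids mismatched groups after a join.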