I am using the 2017 5-year ACS to calculate the number of 23-year-olds with a BA.
I am getting only about 200,000, which is way too low! Am I doing something wrong?
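library(dplyr)

# Sum person weights for 23-year-olds with a BA (educ >= 10)
# interviewed in 2016 (multyear) within the 2017 5-year sample (year).
number_youngBA_2016 <- acs %>%
  filter(age == 23) %>%
  filter(year == 2017) %>%
  filter(multyear == 2016) %>%
  filter(educ >= 10) %>%
  select(perwt) %>%
  summarise(total_23_year_oldBAs = sum(perwt, na.rm = TRUE))
print(number_youngBA_2016)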
# A tibble: 1 × 1
  total_23_year_oldBAs
1               222359
Oh, I am using the weights wrong. You can’t take a single year from a 5-year sample and then use the weights.
You need to grab five individual 1-year samples…
That is correct. You cannot estimate 1-year parameters directly from a 5-year file. In the 5-year sample, the weight PERWT is scaled to account for pooling five years of data, so the weights for any single MULTYEAR sum to roughly one fifth of that year's population. To create accurate, weighted estimates of frequencies for one year of data, use the ACS 1-year samples with PERWT as is. If you ultimately want to use the 5-year file for your analysis and only need a frequency for one year as an interim check, you can multiply PERWT by 5. Note that to estimate empirically derived standard errors or confidence intervals using ACS data, you need to use replicate weights.
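For anyone who wants to see the interim check spelled out, here is a minimal sketch in R. It runs on a small made-up data frame (acs_toy is hypothetical, and its values are not real ACS figures) and assumes the lowercase column names used in the question. It compares the raw sum of perwt for one multyear with the same sum after multiplying perwt by 5. Treat the scaled number as a rough sanity check only, not as a substitute for the 1-year sample.

```r
library(dplyr)

# Toy stand-in for a 5-year ACS extract.
# All values are made up for illustration; they are NOT real ACS data.
acs_toy <- tibble(
  multyear = c(2013, 2014, 2015, 2016, 2016, 2017),
  age      = c(23, 23, 23, 23, 23, 23),
  educ     = c(10, 11, 10, 10, 6, 10),
  perwt    = c(20, 25, 30, 40, 35, 15)
)

interim_check <- acs_toy %>%
  # Keep one survey year, 23-year-olds, BA or higher (same filters as the question).
  filter(multyear == 2016, age == 23, educ >= 10) %>%
  summarise(
    # Raw sum of the pooled 5-year weights for that single year.
    pooled_weight_sum = sum(perwt, na.rm = TRUE),
    # Rough single-year count: scale each pooled weight by 5.
    approx_one_year = sum(perwt * 5, na.rm = TRUE)
  )

print(interim_check)
```

With the real extract, the same pattern would replace acs_toy with acs. The scaled total should land in the neighborhood of what the 2016 1-year sample gives, but the two will not match exactly.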