Hi! I’m using several IPUMS USA samples for an RA project (2007 3-year, 2010 3-year, 2013 3-year, 2018 5-year, and 2022 5-year). One of the variables we are trying to construct is, for each respondent, the weighted proportion of coethnics within their PUMA, based on how they are categorized in the RACE variable. I’ve cleaned each variable (RACE is currently a categorical variable), but I’m a little unclear about how to calculate these weighted proportions. I followed the instructions in the IPUMS User Notes for getting weighted means of binary variables, but I’m not sure how to extend them to a variable with several categories. I’ve used the code below, but would appreciate any advice or clarification! Thank you!
data <- data %>%
  group_by(MULTYEAR = haven::as_factor(MULTYEAR),
           STATEICP = haven::as_factor(STATEICP),
           PUMA = haven::as_factor(PUMA)) %>%
  summarize(PropCoethnicPUMA = weighted.mean(RACE, PERWT, na.rm = TRUE))
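
To make the target quantity concrete, here is a minimal sketch on invented toy data of what I think the calculation should look like: for each respondent, the summed PERWT of people in the same RACE category in their PUMA, divided by the summed PERWT of everyone in that PUMA. The `toy` data frame and the helper columns `puma_wt` and `race_wt` are hypothetical names used only for illustration, and the sketch assumes RACE has already been recoded to the categories of interest. It uses `mutate()` rather than `summarize()` so that each respondent keeps their own row, and it counts the respondent in their own group share, which may or may not be what is wanted.

```r
library(dplyr)

# Toy stand-in for the IPUMS extract; every value here is invented.
toy <- tibble(
  MULTYEAR = 2010,
  STATEICP = 1,
  PUMA     = c(100, 100, 100, 200, 200),
  RACE     = c("White", "White", "Black", "Black", "Asian"),
  PERWT    = c(10, 30, 60, 50, 50)
)

result <- toy %>%
  # Total person weight in each PUMA (within sample and state).
  group_by(MULTYEAR, STATEICP, PUMA) %>%
  mutate(puma_wt = sum(PERWT)) %>%
  # Total person weight of each RACE category within that PUMA.
  group_by(RACE, .add = TRUE) %>%
  mutate(
    race_wt = sum(PERWT),
    # Weighted share of the respondent's own RACE category in their PUMA.
    PropCoethnicPUMA = race_wt / puma_wt
  ) %>%
  ungroup()

print(result)
```

This seems equivalent to building a 0/1 indicator for each RACE category and taking `weighted.mean(indicator, PERWT)` within each PUMA, which is the binary-variable approach applied once per category.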