R error when I try to filter data and apply sample weights

I get the following error:

Error in proxy[i, ..., drop = FALSE] : incorrect number of dimensions

Please help. I can’t seem to figure out where I’m going wrong. Is my For Loop wrong? Or is my created variable named “output” being done incorrectly? Everything seems right until I get to my For Loop. Here’s my code:

library(ipumsr)
library(dplyr)

ddi <- read_ipums_ddi("cps_00026.xml")
data <- read_ipums_micro(ddi)
data2 <- data %>%
  rename_all(tolower)

data3 <- data2 %>%
  mutate(nh_white = if_else(race == 100 & hispan == 0, 1, 0),
         nh_black = if_else(race == 200 & hispan == 0, 1, 0),
         nh_api = if_else(race %in% c(651, 652) & hispan == 0, 1, 0),
         nh_other = if_else(!race %in% c(100, 200, 651, 652) & hispan == 0, 1, 0),
         latinx = if_else(hispan > 0 & hispan < 900, 1, 0),
         wtfinl_six_mo = wtfinl / 6
  )
years <- c(2014, 2015, 2016, 2017, 2018, 2019)
output <- vector("double", length(years))
for (i in seq_along(years)) {
  data_filtered <- data3 %>%
    #Filter for all Georgia entries where the source of public assistance income
    #is not coded as not in Universe (0)
    filter(srcwelfr == 1 & srcwelfr == 2 & srcwelfr == 3 &
           year == years[i] & statefip == 13)
  #Apply proper weights
  table1 <- wtd.table(data_filtered$srcwelfr, weights = data_filtered$asecwth)
  table1

  #Turn that table into a dataframe so that we can work with it

  srcwelfr_year <- as.data.frame(table1)
  srcwelfr_year <- reshape(srcwelfr_year, idvar = "status", timevar = "Var1", direction = "wide")
  print(srcwelfr_year)

I recreated your code and was able to run the for loop without error, but the error appeared when I tried to create the table from the data frame that the loop produces. The message “Error in proxy[i, ..., drop = FALSE] : incorrect number of dimensions” comes up because the loop as written results in an empty data frame: the filter requires srcwelfr == 1 & srcwelfr == 2 & srcwelfr == 3, and no single record can hold all three values at once, so no rows survive.

I can’t tell exactly what your end goal is from your code, but a for loop may be unnecessary for what you want. I don’t know what is appropriate for your specific analytical approach, but I think you could use group_by() and summarize() to accomplish what the loop in your initial post is doing. For sample code and examples of using IPUMS data in R, you may be interested in these exercises.

Also, keep in mind that if you are summarizing person-level data (such as SRCWELFR), you need to use the person-level weight, ASECWT, rather than the household-level weight ASECWTH that the loop uses.
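To make the empty-data-frame point concrete, here is a minimal sketch on a made-up data frame. The `toy` tibble and all of its values are assumptions for illustration, not CPS data; only the column names follow the original post. It compares the `&` filter with an `%in%` filter, then computes weighted counts with group_by() and summarize().

```r
library(dplyr)

# Made-up stand-in for the CPS extract; the values are illustrative only.
toy <- tibble(
  year     = c(2014, 2014, 2014, 2015, 2015, 2015),
  statefip = c(13, 13, 13, 13, 13, 12),
  srcwelfr = c(1, 2, 0, 3, 1, 1),
  asecwt   = c(100, 150, 200, 120, 80, 90)
)

# A single value cannot equal 1, 2, and 3 at once, so this keeps no rows.
empty <- toy %>%
  filter(srcwelfr == 1 & srcwelfr == 2 & srcwelfr == 3)
nrow(empty)

# %in% keeps rows whose value matches any one of the listed codes.
in_universe <- toy %>%
  filter(srcwelfr %in% c(1, 2, 3), statefip == 13)

# Weighted counts per year and code, summing the person-level weight.
weighted_counts <- in_universe %>%
  group_by(year, srcwelfr) %>%
  summarize(n = sum(asecwt), .groups = "drop")
weighted_counts
```

Whether `%in% c(1, 2, 3)` is the right universe for a given analysis depends on the SRCWELFR codes in the extract, so treat that filter as a placeholder.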

Many thanks for your response. You’re right that a for loop isn’t really necessary here. After much trial and error, I changed my analysis; here’s a snippet of it, using group_by() and summarize() instead. (I’m still building my R skills, so at some point I need to learn how to run this code across multiple years, so that I don’t have to write out the same analysis by hand for each year I’m researching.)

library(ipumsr)
library(dplyr)

ddi <- read_ipums_ddi("cps_00028.xml")
data <- read_ipums_micro(ddi)
data2 <- data %>%
  rename_all(tolower)
data3 <- data2 %>%
  mutate(nh_white = if_else(race == 100 & hispan == 0, 1, 0),
         nh_black = if_else(race == 200 & hispan == 0, 1, 0),
         nh_api = if_else(race %in% c(651, 652) & hispan == 0, 1, 0),
         nh_other = if_else(!race %in% c(100, 200, 651, 652) & hispan == 0, 1, 0),
         latinx = if_else(hispan > 0 & hispan < 900, 1, 0),
         wtfinl_six_mo = wtfinl / 6
  )
##############################################################################

#DOING 2014 ANALYSIS
year2014 <- data3 %>%
  filter(year == 2014)

#2014 Question1: By race, ethnicity and gender, what are the counts and
#proportions of TANF recipients in GA with an SPM unit status
#ABOVE poverty?
year2014analysis1 <- year2014 %>%
  filter(srcwelfr == 1, spmpov == 0, statefip == 13) %>%
  group_by(hispan = haven::as_factor(hispan), race = haven::as_factor(race),
           sex = haven::as_factor(sex)) %>%
  summarize(n = sum(asecwt)) %>%
  mutate(pct = n / sum(n))
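
On the open question of repeating this for every year: one possible approach, sketched here on made-up data, is to add year to group_by() instead of writing a loop or one block per year. The `toy` tibble and its values are assumptions for illustration, and plain character values stand in for the labelled IPUMS variables, so haven::as_factor() is left out.

```r
library(dplyr)

# Made-up data; column names mirror the post, values do not come from CPS.
toy <- tibble(
  year     = c(2014, 2014, 2014, 2015, 2015),
  statefip = 13,
  srcwelfr = 1,
  spmpov   = 0,
  sex      = c("Male", "Female", "Female", "Male", "Female"),
  asecwt   = c(100, 300, 100, 50, 150)
)

by_year <- toy %>%
  filter(srcwelfr == 1, spmpov == 0, statefip == 13) %>%
  # Grouping by year first handles every year in a single pass.
  group_by(year, sex) %>%
  # "drop_last" leaves the result grouped by year only.
  summarize(n = sum(asecwt), .groups = "drop_last") %>%
  # sum(n) is therefore a within-year total, so pct is a within-year share.
  mutate(pct = n / sum(n)) %>%
  ungroup()
by_year
```

Because summarize() drops only the last grouping variable here, the order of variables in group_by() decides which total the percentages are taken against.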
