Hi all,
I would love some suggestions on aggregating CPS ASEC data at the household level, and in particular on how to use the ASEC weights.
As far as the aggregation is concerned, what I did so far is:
- For variables whose values are already aggregated and repeated for each member of the household (HHINCOME, FOODSTMP, etc.), I kept only the householder's observation (RELATE == 101). Alternatively, I dropped the duplicates, and the result is the same.
- For variables at the individual level (e.g. unemployment benefits, INCSSI, etc.), I summed the values across the members of each household.
Is this correct? (I think so, but further confirmation would be welcome.)
Most importantly, I am not sure how to use the weights. I did the following:
- I chose ASECWTH.
- I interpreted it as the number of people with the same characteristics as the corresponding respondent.
- Hence, to compute the percentage of (for example) SNAP recipients who are African American, I took the sum of the weights of household observations with both the African American and SNAP dummies equal to 1, divided by the sum of the weights of household observations with the SNAP dummy equal to 1.
Is this right? Or should I use the weights in another way (or use other weights)?
I apologise for the silly questions, but this is my first experience as a research assistant and I am not familiar with this dataset.
Thank you very much in advance for your help.
In terms of your aggregation, yes, if you are interested in household level information, you will want to select just one observation per household. It looks like you are using Stata for your analysis; there, keeping only the observations with RELATE==101 or PERNUM==1 will limit your data to one observation per household.
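To make the two aggregation rules concrete, here is a minimal sketch in plain Python. The records and their values are made up for illustration, the helper name `aggregate_households` and the `_HH` suffix are hypothetical, and the sketch assumes a single survey year (so SERIAL alone identifies a household) and that any NIU or missing codes in the person-level income variables have already been recoded to 0.

```python
from collections import defaultdict

# Hypothetical person-level records (made-up values, single survey year).
# SERIAL identifies the household; RELATE == 101 marks the householder.
persons = [
    {"SERIAL": 1, "RELATE": 101, "HHINCOME": 52000, "INCSSI": 0,    "ASECWTH": 1500.25},
    {"SERIAL": 1, "RELATE": 201, "HHINCOME": 52000, "INCSSI": 3000, "ASECWTH": 1500.25},
    {"SERIAL": 2, "RELATE": 101, "HHINCOME": 18000, "INCSSI": 7200, "ASECWTH": 980.5},
    {"SERIAL": 2, "RELATE": 301, "HHINCOME": 18000, "INCSSI": 1200, "ASECWTH": 980.5},
]


def aggregate_households(persons, person_vars):
    """Return one row per household, keyed by SERIAL.

    Household-level variables are taken from the householder's row;
    each variable in person_vars is summed over all household members.
    Assumes NIU/missing codes were recoded to 0 before calling.
    """
    sums = defaultdict(lambda: dict.fromkeys(person_vars, 0))
    households = {}
    for p in persons:
        # Accumulate person-level amounts for every member.
        for var in person_vars:
            sums[p["SERIAL"]][var] += p[var]
        # Keep the householder's row for the household-level variables.
        if p["RELATE"] == 101:
            households[p["SERIAL"]] = {
                k: v for k, v in p.items() if k not in person_vars
            }
    # Attach the household totals under a new (hypothetical) name.
    for serial, row in households.items():
        for var in person_vars:
            row[var + "_HH"] = sums[serial][var]
    return households


hh = aggregate_households(persons, ["INCSSI"])
for serial, row in sorted(hh.items()):
    print(serial, row["HHINCOME"], row["INCSSI_HH"], row["ASECWTH"])
```

The same logic carries over to a group-by-and-sum in R or a `collapse` in Stata; the point is only that repeated household values are taken once while person-level amounts are added up.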
As for weighting, your logic in how weights work is correct and the method you proposed for applying them is valid (strictly speaking, a household weight such as ASECWTH gives the number of households, rather than people, that each sampled household represents). Just be careful, because in the example you gave, it sounds like you are combining a household level variable (SNAP receipt) and a person level variable (RACE) but applying a household weight. To do this at the household level, you would need to define the race variable for the household, for example households where the head of the household is African American, or households where at least one member is African American. Otherwise, you are actually conducting a person level analysis.
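The weighted share described above can be sketched as follows, again in plain Python with made-up numbers. The dummies `SNAP` and `HEAD_BLACK` (householder is African American) and the helper `weighted_share` are hypothetical names used only for illustration; the data are assumed to already contain one row per household.

```python
# Hypothetical household-level rows (one per household, made-up values).
households = [
    {"ASECWTH": 1000.0, "SNAP": 1, "HEAD_BLACK": 1},
    {"ASECWTH": 3000.0, "SNAP": 1, "HEAD_BLACK": 0},
    {"ASECWTH": 2000.0, "SNAP": 0, "HEAD_BLACK": 1},
]


def weighted_share(rows, weight, group, subgroup):
    """Weighted share of `group` households that are also in `subgroup`.

    Numerator: sum of weights where both dummies equal 1.
    Denominator: sum of weights where the `group` dummy equals 1.
    """
    denom = sum(r[weight] for r in rows if r[group] == 1)
    numer = sum(
        r[weight] for r in rows if r[group] == 1 and r[subgroup] == 1
    )
    if denom == 0:
        raise ValueError("no weighted households in the group")
    return numer / denom


share = weighted_share(households, "ASECWTH", "SNAP", "HEAD_BLACK")
print(f"{100 * share:.1f}% of SNAP households have an African American householder")
```

Swapping `HEAD_BLACK` for a dummy that flags households with at least one African American member gives the alternative household-level definition; either way the weight stays the household weight.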
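In Stata (and other statistical tools) there is a very simple way to apply weights. For example, at the end of your command in Stata, you can specify [pweight = ASECWTH], or for simple frequencies and tabulations you can use [fweight = ASECWTH]. Note that the weight expression takes a single equals sign, and that fweight only accepts integer values, so if your copy of ASECWTH has decimals you would need to round it first or use aweight or iweight instead. Some basic information on selecting the appropriate weight syntax can be found here.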
Additionally, since you mentioned you are relatively new as a research assistant and unfamiliar with IPUMS data, it may be worth working through some of the exercises found here. The exercises can help you get familiar with some basic commands in the statistical package of your choice.
Thank you very much!
Don’t worry: I was talking about SNAP and RACE after having “aggregated” them at the household level (I used RELATE == 101 for RACE as well, since the professor told me to consider the race of the householder).
I actually used R (I wrote the condition that way because most people use Stata). But I also exported the dataset with the dummies in .dta format, so I can run the command you suggested in Stata to be 100% sure about the results.
Thank you again, you were very helpful!