I am new to IPUMS. I have downloaded the data and loaded it into MySQL just fine. I have read all kinds of things that say to “apply weighting” to the data, and I have the person and household weight fields.

But I can’t seem to find anything that tells me how to apply weighting unless I have Stata, SAS, or SPSS. Those aren’t in the budget.

Can it be done with basic data software such as MySQL, Access, or Excel? And if so, how?

Thank you

If you are calculating averages or frequencies, you can use the information in the weight variables without statistical software. The person weight reports the number of people in the total population that a single respondent represents. In other words, a respondent with a PERWT value of 10 stands in for herself and 9 other people in the sampling area who are treated as identical to her based on certain demographic characteristics. If you are trying to project your sample to the total population, then this respondent should technically appear in your dataset 10 times. In practice, you do not want to actually replicate each observation by its value of PERWT, as you would likely end up with a dataset of hundreds of millions of observations.
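To see why replication is unnecessary, here is a small sketch in Python using made-up values and weights (not real IPUMS data). It builds the "replicated" dataset explicitly and compares its mean with the mean computed directly from the weights; the two agree, so the weights can be used in place of the copies.

```python
# Toy sample of (value, weight) pairs. These numbers are invented for illustration.
sample = [(30000, 2), (50000, 3)]

# Approach 1: literally repeat each value as many times as its weight says.
expanded = [value for value, weight in sample for _ in range(weight)]
mean_expanded = sum(expanded) / len(expanded)

# Approach 2: use the weights directly, with no replication.
mean_weighted = sum(value * weight for value, weight in sample) / sum(
    weight for _, weight in sample
)

print(mean_expanded, mean_weighted)
```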

To calculate a weighted average, multiply your variable of interest by PERWT and divide by the total weighted population (i.e., the sum of PERWT). For example, if you wanted to estimate the average INCTOT for the general population, you would multiply each applicable respondent’s INCTOT value by her PERWT value, sum this product over all applicable respondents, and then divide this sum by the sum of PERWT for those same respondents. To project sample counts to population frequencies, sum the PERWT values rather than simply counting respondents. For example, if you wanted to know the total number of males in Minnesota, you would sum the PERWT values for all male respondents who reside in MN.
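Since your data is already in a database, both calculations above can be written as ordinary aggregate queries. The sketch below is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs on its own, and the table name `persons`, the lowercase column names, and the four rows of data are all made up. The coding SEX = 1 for male and STATEFIP = 27 for Minnesota is assumed here, so check the codebook for your own extract. The SELECT statements use only SUM, multiplication, division, and WHERE, so they should carry over to MySQL unchanged.

```python
import sqlite3

# In-memory SQLite database standing in for a MySQL table (toy data only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE persons (sex INTEGER, statefip INTEGER, inctot REAL, perwt REAL)"
)
conn.executemany(
    "INSERT INTO persons VALUES (?, ?, ?, ?)",
    [
        (1, 27, 30000, 10),
        (2, 27, 50000, 20),
        (1, 27, 40000, 30),
        (1, 55, 60000, 40),
    ],
)

# Weighted average: SUM(value * weight) / SUM(weight).
# With real data, add a WHERE clause that keeps only applicable respondents
# (for example, excluding any not-applicable codes listed in the codebook).
weighted_mean_inctot = conn.execute(
    "SELECT SUM(inctot * perwt) / SUM(perwt) FROM persons"
).fetchone()[0]

# Weighted count: sum the weights instead of counting rows.
# Assumed coding: sex = 1 is male, statefip = 27 is Minnesota.
males_in_mn = conn.execute(
    "SELECT SUM(perwt) FROM persons WHERE sex = 1 AND statefip = 27"
).fetchone()[0]

print(weighted_mean_inctot, males_in_mn)
```

The same arithmetic works in Excel or Access: add a column holding INCTOT times PERWT, then divide the sum of that column by the sum of PERWT.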

The same logic can be used at the household level with HHWT, which reports the number of households in the total population that a single sampled household represents. When doing household-level analyses, keep only one observation per household (i.e., PERNUM = 1) so that each household is counted once.
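Here is a comparable sketch for the household case, again using sqlite3 only so the example is self-contained. The table layout and values are invented, and it assumes your extract includes SERIAL, PERNUM, HHWT, and a household-level variable such as HHINCOME. The key step is the `WHERE pernum = 1` filter, which keeps one row per household before the weights are summed.

```python
import sqlite3

# In-memory SQLite database standing in for a MySQL table (toy data only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE persons (serial INTEGER, pernum INTEGER, hhincome REAL, hhwt REAL)"
)
conn.executemany(
    "INSERT INTO persons VALUES (?, ?, ?, ?)",
    [
        (1, 1, 80000, 100),  # household 1, two members
        (1, 2, 80000, 100),
        (2, 1, 40000, 300),  # household 2, one member
        (3, 1, 60000, 100),  # household 3, three members
        (3, 2, 60000, 100),
        (3, 3, 60000, 100),
    ],
)

# Estimated number of households: sum HHWT over one row per household.
total_households = conn.execute(
    "SELECT SUM(hhwt) FROM persons WHERE pernum = 1"
).fetchone()[0]

# Weighted average household income, again restricted to one row per household.
mean_hhincome = conn.execute(
    "SELECT SUM(hhincome * hhwt) / SUM(hhwt) FROM persons WHERE pernum = 1"
).fetchone()[0]

print(total_households, mean_hhincome)
```

Without the `pernum = 1` filter, larger households would be counted once per member and the household totals would be overstated.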

Hope this helps.