Good Afternoon
I have some data from 1880 and I’m trying to get counts of interracial marriages.
I feel like I need a weights lesson for dummies. To get a count for the full population from the sample, what would I do? Use the household weight or the person weight? And what number would I multiply that by?
Thank you very much.
Laura
We have several resources that may be helpful to you. In particular, this blog post from 2017 motivates the use of sampling weights and links to a variety of IPUMS project-specific resources on applying sampling weights. In IPUMS USA data, the sampling weight represents how many individuals in the full population a specific individual (or household) in the sample represents. Specifically for your case, it sounds like you should be using the person-level weights. In IPUMS USA this is the PERWT variable. The key reason for this is that the statistic you are aiming to calculate is a person-level characteristic (e.g., marriages). If, on the other hand, you intend to characterize households by the marital status of the reference person (or something similar), then you would want to use the household-level weights (e.g., HHWT).
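As a rough illustration of the person-versus-household distinction, here is a minimal Python sketch on made-up records. Only PERWT and HHWT come from the discussion above; the SERIAL household identifier, the helper function names, and every value are assumptions for illustration. The sketch also assumes HHWT is repeated on each person record within a household, so a household total should count each household once.

```python
# Toy person-level records; every value here is invented for illustration.
# SERIAL is assumed to identify the household each person belongs to.
records = [
    {"SERIAL": 1, "PERWT": 100.0, "HHWT": 100.0},
    {"SERIAL": 1, "PERWT": 100.0, "HHWT": 100.0},
    {"SERIAL": 2, "PERWT": 101.0, "HHWT": 101.0},
    {"SERIAL": 3, "PERWT": 99.0, "HHWT": 99.0},
    {"SERIAL": 3, "PERWT": 99.0, "HHWT": 99.0},
    {"SERIAL": 3, "PERWT": 99.0, "HHWT": 99.0},
]


def estimated_persons(rows):
    """Estimate the number of people: sum the person weight over all rows."""
    return sum(r["PERWT"] for r in rows)


def estimated_households(rows):
    """Estimate the number of households: sum HHWT once per household."""
    weight_by_household = {}
    for r in rows:
        # Later rows of the same household overwrite with the same HHWT.
        weight_by_household[r["SERIAL"]] = r["HHWT"]
    return sum(weight_by_household.values())


print(estimated_persons(records))
print(estimated_households(records))
```

Summing HHWT over every person row instead would count larger households several times, which is why the household total is built from one weight per household.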
I saw that last night too. Thank you for it. I’m just unsure of the math involved. Would I turn the PERWT into a decimal? Then multiply it by something? If so, what?
Exactly how you apply sampling weights to your calculations depends on the software you are using for your analysis. Most statistical software has built-in tools that apply the sampling weights automatically to any statistical analysis. For example, here is a nice discussion about applying sampling weights in Stata. It sounds like you might be using Excel, however. In that case, this page discusses how to apply sampling weights in Excel.
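Whatever the software, the arithmetic behind a weighted count is just a sum: add up PERWT over the sample records that meet your condition. Assuming PERWT was read in as supplied with the extract, there should be no need to convert it or multiply it by anything further. Here is a minimal sketch, written in Python only for concreteness; the `married` and `interracial` flags, the `weighted_count` helper, and all values are hypothetical.

```python
# Toy person records; the flags and weights are invented for illustration.
records = [
    {"PERWT": 100.0, "married": True, "interracial": False},
    {"PERWT": 101.0, "married": True, "interracial": True},
    {"PERWT": 100.0, "married": True, "interracial": True},
    {"PERWT": 99.0, "married": False, "interracial": False},
]


def weighted_count(rows, condition):
    """Estimated population count: sum PERWT over rows meeting the condition."""
    return sum(r["PERWT"] for r in rows if condition(r))


everyone = weighted_count(records, lambda r: True)
married = weighted_count(records, lambda r: r["married"])
in_interracial_marriage = weighted_count(records, lambda r: r["interracial"])

print(everyone, married, in_interracial_marriage)
```

Note that this counts people, not couples. Since each marriage involves two spouses, a count of marriages would likely need one record per couple, for example by restricting the sum to wives only or husbands only.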
Hello, I’m using R, and I’ve contacted a stats friend about this too. I did find this article: “Using tidyverse tools with Pew Research Center survey data in R” by Nick Hatley (Pew Research Center: Decoded, on Medium). Could adding up the weights really be the answer to getting a population total? When I did that, though, my totals differed from the official counts.
In general, we do not expect sample data to reproduce “true” population counts exactly, because applying sampling weights to sample data yields only an estimate of the total population. We do, however, expect the estimate to be relatively close. Your estimates for 1880 and 1870 look okay to me. However, the discrepancy between the estimates and the official Census counts for 1860 and 1850 does seem relatively large. One reason for this could be that the 1850 and 1860 sample data in IPUMS USA are random samples of the free population only in those years. If the official Census count you are comparing against covers the full population, that difference in coverage could explain the discrepancy.