Pooling IPUMS USA and IPUMS CPS


I would like to pool IPUMS CPS and IPUMS USA data so that I have values for the years between censuses. Should the weights be adjusted, given that the sample sizes are not the same?


First, you should be using IPUMS-CPS data only for years between the decennial censuses. In other words, you should not be combining IPUMS-CPS data with IPUMS-USA data from the same year.

Second, you should be using the IPUMS-USA weight variables with the IPUMS-USA data and the IPUMS-CPS weight variables with the IPUMS-CPS data. Both data sources are individually nationally representative; thus, the difference in unweighted sample size does not matter as long as you apply the correct weight.
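As a rough sketch of what this looks like in practice, the Python below copies each source's own person weight into a common column before pooling, then computes a weighted share by year. It assumes PERWT for the IPUMS-USA extract and ASECWT for an IPUMS-CPS ASEC extract; your extract may call for a different weight (for example, a household weight or the basic monthly CPS weight). The records, the `harmonize` helper, and the `weight` and `source` column names are made up for illustration.

```python
# Toy person records; all values are invented for illustration only.
# Assumption: PERWT is the person weight in the USA extract and ASECWT
# is the person weight in the CPS ASEC extract.
usa_rows = [
    {"YEAR": 2000, "SEX": 1, "PERWT": 100.0},
    {"YEAR": 2000, "SEX": 2, "PERWT": 120.0},
    {"YEAR": 2000, "SEX": 2, "PERWT": 80.0},
]
cps_rows = [
    {"YEAR": 2003, "SEX": 1, "ASECWT": 1500.0},
    {"YEAR": 2003, "SEX": 2, "ASECWT": 1700.0},
]


def harmonize(rows, weight_var, source):
    """Copy a source's own weight into a common 'weight' column (hypothetical helper)."""
    return [
        {"YEAR": r["YEAR"], "SEX": r["SEX"], "source": source, "weight": r[weight_var]}
        for r in rows
    ]


# Each year comes from exactly one source, and keeps that source's weight.
pooled = harmonize(usa_rows, "PERWT", "USA") + harmonize(cps_rows, "ASECWT", "CPS")


def weighted_share_female(rows):
    """Weighted share of records with SEX == 2 (female), computed per year."""
    total, female = {}, {}
    for r in rows:
        total[r["YEAR"]] = total.get(r["YEAR"], 0.0) + r["weight"]
        if r["SEX"] == 2:
            female[r["YEAR"]] = female.get(r["YEAR"], 0.0) + r["weight"]
    return {year: female.get(year, 0.0) / total[year] for year in sorted(total)}


shares = weighted_share_female(pooled)
for year, share in shares.items():
    print(year, round(share, 4))
```

Because the share is computed within each year, the very different magnitudes of the two weights (and the different sample sizes behind them) do not affect the result.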

Third, the weighted estimates of total population will likely differ between sources, as the methods for estimating population differ and the original estimates undergo revisions. Since the IPUMS data provide accurate distributions, it is preferable to compare percentages over time rather than totals (e.g., the percentage of females rather than the total number of females). If totals are important for your analysis, you might consider using a single source for your population estimates. For example, the Census Bureau’s Population Estimates Program provides the official historical population estimates for several levels of geography. You could standardize the total populations in each year of your pooled USA-CPS dataset to these official estimates. This removes differences in population totals that arise only from switching between data sources.
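One way to do that standardization, sketched in Python under some assumptions: for each year, multiply every weight by the ratio of the official total to the weighted total in your pooled data. The `official_totals` figures below are placeholders, not real Population Estimates Program numbers, and `standardize_weights`, `weight`, and `weight_std` are hypothetical names.

```python
# Toy pooled records with a common 'weight' column; values are invented.
pooled_rows = [
    {"YEAR": 2000, "SEX": 1, "weight": 100.0},
    {"YEAR": 2000, "SEX": 2, "weight": 120.0},
    {"YEAR": 2000, "SEX": 2, "weight": 80.0},
    {"YEAR": 2003, "SEX": 1, "weight": 1500.0},
    {"YEAR": 2003, "SEX": 2, "weight": 1700.0},
]

# Placeholder control totals by year (NOT real official estimates).
official_totals = {2000: 330.0, 2003: 3040.0}


def standardize_weights(rows, control_totals):
    """Rescale weights so each year's weighted total equals its control total."""
    weighted_total = {}
    for r in rows:
        weighted_total[r["YEAR"]] = weighted_total.get(r["YEAR"], 0.0) + r["weight"]

    standardized_rows = []
    for r in rows:
        # One scaling factor per year, applied to every record in that year.
        factor = control_totals[r["YEAR"]] / weighted_total[r["YEAR"]]
        standardized_rows.append({**r, "weight_std": r["weight"] * factor})
    return standardized_rows


standardized = standardize_weights(pooled_rows, official_totals)

# Check: the rescaled totals now match the control totals in each year.
new_totals = {}
for r in standardized:
    new_totals[r["YEAR"]] = new_totals.get(r["YEAR"], 0.0) + r["weight_std"]
print(new_totals)
```

Since every record in a year is scaled by the same factor, percentages within a year are unchanged; only the totals move to the control values. This simple version matches the overall total only. If you need totals within subgroups (for example by state or age) to line up as well, you would need control totals for those groups.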

Hope this helps.