Hello. For teaching purposes, I would like to download random samples of roughly 1,000 observations from each of about 50-100 US cities, with about 20 basic demographic and economic variables such as sex, age, race/ethnicity, household size, employment status, income, and education. Would IPUMS USA be an appropriate source for this? If so, could someone point me to the process for downloading data for a specific city? (Alternatively, perhaps I could download all cities at once and then sort, which would mean starting with 50K to 100K observations…) Help!
IPUMS USA would certainly do the trick. This video tutorial provides a basic overview of how to create a custom dataset with IPUMS USA. You can leverage the “Select cases” tool in the final stages of submitting your data extract request to keep only cases with certain values (e.g., certain metropolitan areas) for the variables that you have included in your extract (e.g., CITY, METRO, or MET2013).
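Once an extract is downloaded, drawing roughly 1,000 cases per city can be done locally. Here is a minimal Python sketch of that step, assuming the extract has been read into a list of dictionaries (for example with `csv.DictReader`) and that it contains the CITY variable. The function name `sample_per_city` and the toy data are hypothetical and only for illustration; this is not an IPUMS tool.

```python
import random
from collections import defaultdict


def sample_per_city(rows, n=1000, key="CITY", seed=0):
    """Return up to n randomly chosen rows for each distinct value of `key`."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)

    sampled = []
    for city in sorted(groups):
        members = groups[city]
        k = min(n, len(members))  # small cities contribute all their cases
        sampled.extend(rng.sample(members, k))  # sampling without replacement
    return sampled


# Toy stand-in for an extract: three made-up city codes with 50, 100, and 150 rows.
toy_rows = [
    {"CITY": str(code), "AGE": str(20 + i % 50)}
    for code in (10, 20, 30)
    for i in range(5 * code)
]
toy_sample = sample_per_city(toy_rows, n=100)
```

Note that this draws cases with equal probability and ignores the sample weights, so the resulting subsample carries the same caveats as any unweighted analysis of the original data.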
Thank you for getting back to me. I’ve been working on the above: downloading random samples of roughly 1,000 observations from each of about 50-100 US cities, with about 20 basic demographic and economic variables such as sex, age, race/ethnicity, household size, employment status, income, and education. Because it is for teaching purposes (an introductory course), I would like to avoid weighting but still include economic variables such as income, education, and employment. The only samples with these variables seem to be the weighted ones. Any suggestions? Is there a method for transforming a weighted sample into a pseudo-unweighted sample? Or is there a fairly recent unweighted sample with a more extensive set of variables than the 2010 10% sample?
The sample design of the ACS requires the use of sample weights for producing accurate estimates. I can understand that weighting might be beyond the scope of an introductory course. I recommend skipping weights in analysis (possibly covering the concept briefly and letting students know that the results they get are not necessarily accurate), using the 2010 10% sample for a subset of analyses where you don’t require the more detailed topical coverage of the ACS/long-form, or both. The ACS Data Users Group (and corresponding forum) may be able to offer suggestions about transforming the PUMS as you describe, but I am not familiar with methods that allow for this.
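If it helps to show students briefly why weights matter, a small comparison of a weighted and an unweighted mean may be enough. The sketch below uses made-up incomes and weights; in an actual IPUMS USA extract, the person-level weight would be PERWT (a variable not mentioned above, so treat that as an assumption about your extract). The helper names are hypothetical.

```python
def unweighted_mean(values):
    """Plain average: every sampled case counts equally."""
    return sum(values) / len(values)


def weighted_mean(values, weights):
    """Average in which each case counts in proportion to its sample weight."""
    total_weight = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Made-up example: the lowest-income case represents three times as many people.
incomes = [20000, 40000, 90000]
weights = [300, 100, 100]

plain = unweighted_mean(incomes)
weighted = weighted_mean(incomes, weights)
print(plain, weighted)  # prints 50000.0 38000.0
```

The gap between the two numbers is the kind of inaccuracy students should expect when weights are skipped.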