I am going to be doing a series of analyses on IPUMS CPS data. Whether as papers or blog posts, these will include R code to replicate them exactly. I would like to be able to supply an easy method for the user to exactly replicate the data sets on which I ran that code, without the user needing to hand-select all the variables and categories of variable involved. Do you have a way that I can do that? If so, can you point me to where it is documented? Through your API maybe?
My (somewhat grandiose) goal for the package I am writing to distribute that code is to increase the number of people using IPUMS CPS data to answer questions about causes, consequences, and trends in economic inequality by two orders of magnitude. In particular, I want to make it straightforward for undergraduates and the research staff of public interest advocacy organizations to tweak the code I provide to answer such questions for their own states and perhaps some of the larger counties or MSAs.
I know IPUMS tracks publications based on IPUMS CPS data. I have not been able to find that information on your website, but I am sure it is in pretty much every one of your grant applications. If it is on your website, could you point me toward it? I would like to see not only the most recent data, but also a time series showing how it has grown.
You people provide a terrific public service. I hope you, all of you, know that.
Yours, Andrew Hoerner
Hi Andrew,
Thank you for the kind words. We are very excited to hear about your plan to provide an easy method for users to exactly replicate your analyses of IPUMS CPS data! This sounds like an ideal application of our API, which allows researchers to share R code with the specifications for a data extract (samples, variables, and other options) that other users can run themselves to download and input the data directly in R. Our ipumsr documentation website provides a vignette with sample code for submitting API data requests using the ipumsr package. Check out the define_extract_micro() function in particular. The API provides most of the same functionality as the online extract system and we are happy to answer any questions that you have about the API. The only portion that cannot be automated is that each person looking to use your provided replication code will still need to spend a couple minutes creating an account on IPUMS CPS and obtaining their unique API key.
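As a rough illustration of the workflow described above, here is a minimal sketch of what shareable replication code might look like with ipumsr (version 0.8 or later is assumed, since that is where define_extract_micro() is available). The sample ID, the variable list, the description text, and the helper name fetch_cps_extract() are placeholders chosen for this example, not a recommendation for any particular analysis. Each user is assumed to have registered with IPUMS CPS and stored their own API key, for example with set_ipums_api_key().

```r
library(ipumsr)

# One-time setup for each user (not run here): store a personal API key.
# set_ipums_api_key("paste-your-key-here", save = TRUE)

# Define the extract once, in code, so readers do not have to
# hand-select samples and variables in the online extract system.
# The sample and variables below are illustrative placeholders.
cps_extract <- define_extract_micro(
  collection  = "cps",
  description = "Example replication extract (illustrative)",
  samples     = c("cps2019_03s"),
  variables   = c("AGE", "SEX", "STATEFIP", "INCTOT", "ASECWT")
)

# Hypothetical helper: submit the definition, wait for IPUMS to build it,
# download the files, and read the data into R.
# Requires network access and a valid API key.
fetch_cps_extract <- function(extract, download_dir = tempdir()) {
  submitted <- submit_extract(extract)
  ready     <- wait_for_extract(submitted)
  ddi_path  <- download_extract(ready, download_dir = download_dir)
  read_ipums_micro(ddi_path)
}

# Usage (not run here):
# cps_data <- fetch_cps_extract(cps_extract)
```

A definition like this can also be written to a file with save_extract_as_json() and rebuilt on another machine with define_extract_from_json(), which may be a convenient way to ship the specification alongside a package. If specific categories of a variable are needed, var_spec() accepts case selections in place of a bare variable name.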
While I don’t have a time series of how IPUMS has grown, our 2024 IPUMS By the Numbers page should give you a sense of how we’ve grown over the past year. We do our best to keep track of publications using IPUMS data by recording them in our IPUMS Bibliography tool. You can use the tool to search by data collection, topic, or any of the other available filters. Once your papers or blog posts have been released, we would appreciate it if you could add them to the bibliography as well!