I would like to use the IPUMS API to make a 1-year ACS query that yields results identical to those of the Data Extractor.
For example, when selecting all variables in the Data Extractor for the 1-year ACS 2021 (2021_ACS1), what is the equivalent API query?
Similarly, when selecting all variables in the Data Extractor for the 5-year ACS 2017-2021, what is the equivalent API query?
The requirements of a query are listed here:
https://developer.ipums.org/docs/v2/workflows/create_extracts/nhgis_data/
Reading the documentation, I was able to find the dataset codes here:
https://www.nhgis.org/overview-nhgis-datasets#dsg82
However, it’s not clear what parameters are required to define an extract whose results are identical to those from the Data Extractor tool, found here:
https://usa.ipums.org/usa-action/variables/group
Thank you for your advice.
Dear Joseph,
The question you’ve asked deals with two different IPUMS products, NHGIS and USA, which provide different types of data. Thus, an API request to NHGIS will never return results identical to those from the USA data access system. The NHGIS dataset codes that you link to are not the same as the sample IDs used in IPUMS USA, so you can’t use the NHGIS dataset codes to create an extract request for IPUMS USA. Additionally, while we have an API to query the NHGIS metadata, we do not yet have an API to query all the IPUMS USA metadata.
IPUMS USA provides microdata, where every record is for an individual person (nested within a household) and every column is a variable describing some characteristic of the person. NHGIS provides summary data, where every record is for a geographic unit (e.g., state, county, city) and every column is a tabulation (or cross-tabulation) describing the aggregate characteristics of people residing in the geographic unit. I will provide a bit more detail about both IPUMS USA and IPUMS NHGIS below.
NHGIS starts with the summary data published by the Census Bureau, restructures it into our record layout and format, and then provides access to it via our data access system. NHGIS does not tabulate microdata from IPUMS USA. USA starts with microdata published by the Census Bureau, restructures it into our record layout, harmonizes codes, and adds additional geographic identifiers. Because of the different source data and data formats, you cannot use metadata or an API call from one data collection to submit a comparable request to the other.
If you’re an R user, we recently released a new version of our ipumsr package that supports programmatic interaction with the API through R functions. Some functions in that package can help you out. For example, you can use get_sample_info(collection = "usa") to get the sample names for all IPUMS USA samples. That would be a starting point for generating an IPUMS USA extract.
If you build an extract request in ipumsr, you can then write it out to a JSON file using the save_extract_as_json() function. You could then use this JSON file as the building block to create other extract requests.
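As a rough illustration of the building-block idea, here is a minimal Python sketch that takes one extract request and derives a second one for a different sample. Several things here are assumptions, not confirmed details: the field names (description, dataStructure, dataFormat, samples, variables) follow my understanding of the microdata extract request body in the IPUMS API v2 documentation, the sample IDs us2021a (2021 1-year ACS) and us2021c (2017-2021 5-year ACS) should be checked against the get_sample_info() output, AGE and SEX are placeholder variables, and with_samples is a hypothetical helper. A file written by save_extract_as_json() may contain additional fields. The sketch only builds the JSON and does not contact the API.

```python
import copy
import json

# Assumed request shape for an IPUMS USA microdata extract (API v2).
# Sample IDs and variable names are illustrative; verify them before use.
BASE_REQUEST = {
    "description": "2021 ACS 1-year extract",
    "dataStructure": {"rectangular": {"on": "P"}},
    "dataFormat": "csv",
    "samples": {"us2021a": {}},
    "variables": {"AGE": {}, "SEX": {}},
}


def with_samples(request, sample_ids, description):
    """Return a copy of `request` that targets `sample_ids` instead.

    Hypothetical helper: the original request is left unchanged.
    """
    new_request = copy.deepcopy(request)
    new_request["samples"] = {sample_id: {} for sample_id in sample_ids}
    new_request["description"] = description
    return new_request


# Reuse the same variable selection for the (assumed) 5-year sample ID.
five_year = with_samples(
    BASE_REQUEST, ["us2021c"], "2017-2021 ACS 5-year extract"
)
print(json.dumps(five_year, indent=2))
```

The variable list in this sketch is spelled out by name, so reproducing a Data Extractor selection this way would mean listing every variable you selected there.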
Sincerely,
Dave Van Riper
IPUMS NHGIS