Thanks for the follow-up message and clarification. It sounds like you are trying to download IPUMS microdata via the IPUMS API and running into issues. Please correct me if I have misunderstood, for example if you are also trying to access summary file data via the API or have also run into issues using the traditional web interface. Note that we welcome feedback on the API via email (email@example.com) or via the API topic on the User Forum.
If there are specific errors or issues in the API examples and documentation, please let us know about them so we can correct them. Without further detail about the examples or documents that aren’t working, I can share a few general comments that are hopefully useful to you.
First, we provide native-client libraries for R and Python users interested in working with the IPUMS API in those languages. These libraries are designed to streamline the process of submitting API requests without requiring you to work through all of the details of the API interface: you name the variables and samples of interest and submit the request, so there is no need to write a JSON definition directly (though the libraries can also export the resulting JSON definition). We also provide JSON versions of extract requests submitted via the web interface upon request; if you are interested, please contact firstname.lastname@example.org with the data collection and extract number for which you would like a JSON file.
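To give a sense of what a JSON definition involves if you do want to write one by hand, here is a minimal sketch in Python using only the standard library. Please treat it as an illustration rather than a reference: the field names (`description`, `dataStructure`, `dataFormat`, `samples`, `variables`), the helper `build_extract_definition`, and the sample and variable names are assumptions on my part, so check them against the developer portal before submitting anything.

```python
import json


def build_extract_definition(description, samples, variables):
    """Assemble an extract definition as a plain dict.

    The key names below are an assumed schema used for illustration only.
    """
    if not samples or not variables:
        raise ValueError("at least one sample and one variable are required")
    return {
        "description": description,
        # Assumed: a rectangular (one row per person) fixed-width extract.
        "dataStructure": {"rectangular": {"on": "P"}},
        "dataFormat": "fixed_width",
        # Each sample and variable maps to an (empty) options object.
        "samples": {name: {} for name in samples},
        "variables": {name.upper(): {} for name in variables},
    }


# Illustrative sample ID and variable names.
definition = build_extract_definition(
    "Example extract", ["us2019a"], ["age", "sex"]
)
print(json.dumps(definition, indent=2))
```

The client libraries build a structure like this for you, which is why I would suggest starting with them unless you specifically need to work with the raw JSON.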
Second, the IPUMS API is relatively new and is currently limited to providing data access for household-person microdata (currently available for IPUMS USA, IPUMS CPS, and IPUMS International) as well as data and metadata access to summary data (currently available for IPUMS NHGIS). I am sharing a full list of supported and unsupported features. We are aware that metadata access via the API for the microdata collections will be valuable to users; in the meantime, the household-person microdata page of our developer portal provides suggestions for accessing key metadata.
Third, note that IPUMS provides data from a variety of sources and in different formats through our user interface. These are distinct data collections, and while they share an underlying metadata structure, the user-facing metadata for the different collections are not designed to be used in conjunction with one another.
Finally, it sounds like you were able to make headway in determining the necessary storage requirements using the 1-year ACS as a reference. As you benchmark storage needs against the ACS PUMS, I will note that our data structures and dataset sizes are very diverse, which makes back-of-the-envelope calculations about the size of the entire IPUMS data collections difficult. Our microdata database contains 2.5 billion records across more than 2,500 datasets, and our summary file data contain nearly 500 billion data cells. Some microdata files will be higher density than the ACS (full count files, some IPUMS International samples), while others will be lower-density samples that include more variables (CPS, NHIS, MEPS, Time Use, DHS, PMA). IPUMS NHGIS includes many millions of geographic identifiers describing geographic areas throughout the U.S. at dozens of levels, going down to individual blocks.
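If it helps with your benchmarking, the kind of rough calculation I have in mind can be sketched as below. The only number taken from our holdings is the 2.5 billion microdata records mentioned above; the per-record widths are hypothetical placeholders (real widths vary a great deal by collection and by which variables you select), and the function `estimate_storage_gib` is just an illustrative helper that assumes uncompressed fixed-width records.

```python
def estimate_storage_gib(n_records, bytes_per_record):
    """Rough uncompressed size in GiB for fixed-width records."""
    if n_records < 0 or bytes_per_record < 0:
        raise ValueError("inputs must be non-negative")
    return n_records * bytes_per_record / 2**30


# Record count quoted above for the full microdata database.
TOTAL_MICRODATA_RECORDS = 2_500_000_000

# Hypothetical record widths in bytes; substitute widths measured
# from extracts you have actually downloaded.
for width in (200, 1_000, 5_000):
    size = estimate_storage_gib(TOTAL_MICRODATA_RECORDS, width)
    print(f"{width:>5} bytes/record -> about {size:,.0f} GiB")
```

Because the result scales linearly with the assumed width, the spread across these placeholder widths shows why a single reference file such as the 1-year ACS is a weak guide to the collections as a whole.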
I hope this helps. Please follow up with questions.