Working with 100% samples


My main question is about working with the 100% samples. Specifically, I am hoping to extract the 100% samples for 1920, 1930, and 1940, with about eight variables. I see that each of these is about 5 GB. How long does an extract of that size generally take to be ready? This is strictly for planning purposes. I have read elsewhere that files can take hours or even days to be ready, but I was curious whether a more precise estimate is possible for a specific file size.

Additionally, I am afraid that in my ignorance I initially requested some extracts that are far too big. Would IPUMS staff be able to delete these? I am concerned they might be using up server capacity that would be better spent on realistic requests.

Thanks very much!

The time it takes to create an extract depends on its size, on how many other users are requesting data, and on how large their requests are. On IPUMS USA, the number of records, rather than the number of variables, tends to be what slows an extract down. Your plan to request the full count file for each decennial census in a separate request seems like a good approach. I would guess that a full count extract could take 1 to 2 days, but that can vary for the reasons described above. Requesting fixed-width rather than formatted extracts is one way to reduce extract wait times.
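If you do go with fixed-width output, the file has to be parsed by column position on your end. The following is only a minimal sketch of that idea in Python, not an official IPUMS tool: the variable names and column positions in `LAYOUT` are made up for illustration, and the real positions must be taken from the codebook that accompanies your extract. It streams records one line at a time, which matters for files of several GB.

```python
import io
from typing import Dict, Iterable, Iterator, Tuple

# HYPOTHETICAL layout: variable name -> (start, end), 0-based, end-exclusive.
# Replace these with the column positions listed in your extract's codebook.
LAYOUT: Dict[str, Tuple[int, int]] = {
    "YEAR": (0, 4),
    "SERIAL": (4, 12),
    "AGE": (12, 15),
}


def iter_records(
    lines: Iterable[str],
    layout: Dict[str, Tuple[int, int]] = LAYOUT,
) -> Iterator[Dict[str, int]]:
    """Yield one dict of integer values per fixed-width line, without
    loading the whole file into memory."""
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        yield {
            name: int(line[start:end])
            for name, (start, end) in layout.items()
        }


# Two invented records standing in for a real extract file.
sample = io.StringIO("194000000001034\n194000000002007\n")
records = list(iter_records(sample))
```

With a real extract, you would pass an open file handle (for example `open(path, "r")`, where `path` is wherever you saved the data file) instead of the in-memory sample, and process the records as they are yielded rather than collecting them all into a list.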

I have followed up with you directly via email about your enthusiastic first extracts. Thanks for following up about this and keeping all IPUMS USA users in mind.