.dat.gz file not opening on Mac

I am attempting to download full count linked census data for the years 1850 to 1900. When I try to open the zipped file on my computer (a Mac), I get "Error 79 - Inappropriate file type or format."

I have downloaded a couple of alternative unarchiver apps. They appear to work better, but they only produce an unzipped file of 140 bytes. When I opened that file in Stata, it appeared to contain data for just one individual.

I am worried that the file I am downloading from IPUMS is somehow corrupted… Is this possible?

Hi @Sofia_Tennent! Although the linked full count files drop records that don’t link, they are typically still quite large files. My best guess is that your file didn’t download completely. I recommend trying to download the file again (over a wired ethernet connection if possible). Please contact us directly at ipums@umn.edu if the issue persists.
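One rough way to tell an incomplete download from a small but valid file is to read the compressed file all the way through. The sketch below uses only Python's standard `gzip` module; the function names `inspect_gzip` and `describe` are made up for illustration and are not part of any IPUMS tooling. A truncated gzip file raises `EOFError` when read to the end, whereas a complete one reports its decompressed size and line count.

```python
import gzip
import zlib


def inspect_gzip(path, chunk_size=1024 * 1024):
    """Read a .gz file to the end; return (decompressed_bytes, line_count).

    Raises EOFError if the archive is truncated and gzip.BadGzipFile
    if the file is not gzip data at all.
    """
    total_bytes = 0
    line_count = 0
    with gzip.open(path, "rb") as fh:
        while True:
            chunk = fh.read(chunk_size)  # stream, so large extracts fit in memory
            if not chunk:
                break
            total_bytes += len(chunk)
            line_count += chunk.count(b"\n")
    return total_bytes, line_count


def describe(path):
    """Return a short human-readable verdict for the file at `path`."""
    try:
        size, lines = inspect_gzip(path)
    except EOFError:
        return "truncated: the download probably did not finish"
    except (gzip.BadGzipFile, zlib.error) as exc:
        return f"not a valid gzip file: {exc}"
    return f"ok: {size} bytes, {lines} lines after decompression"
```

Calling `describe("extract.dat.gz")` (a placeholder file name) on the downloaded file would then distinguish the two cases. If the verdict is "ok" but the size is tiny, the download itself is probably fine and the extract simply contains very few records.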

Hi,

I was also having this issue, with all of the extracts I tried and even on multiple downloads of the same extract. Specifically I was getting Archive Utility’s -79 error.

After doing some digging around online, I saw someone (maybe in an earlier post on this board, I can’t remember) suggest trying the third-party software “The Unarchiver” (I won’t link to it here, but you can find it in the App Store for free). It worked just fine on each of the files I tried it on, and was even faster than Archive Utility. I suggest giving it a try if downloading the file again does not fix the issue.
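For anyone who would rather not install another app, a `.dat.gz` file can also be decompressed with a few lines of Python. This is only a sketch using the standard library; `gunzip` is a hypothetical helper name and `extract.dat.gz` in the usage note is a placeholder, not a real IPUMS file name.

```python
import gzip
import shutil
from pathlib import Path


def gunzip(src, dest=None):
    """Decompress `src` (a .gz file) and return the path of the output file.

    If `dest` is not given, the output is written next to `src` with the
    trailing ".gz" removed (extract.dat.gz -> extract.dat).
    """
    src = Path(src)
    if dest is None:
        if src.suffix != ".gz":
            raise ValueError(f"expected a .gz file, got {src.name}")
        dest = src.with_suffix("")  # drops only the final ".gz"
    dest = Path(dest)
    with gzip.open(src, "rb") as fin, open(dest, "wb") as fout:
        shutil.copyfileobj(fin, fout)  # copies in chunks, not all at once
    return dest
```

For example, `gunzip("extract.dat.gz")` would write `extract.dat` in the same folder. An incomplete download makes the copy fail with `EOFError` partway through.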

Hi @KariWilliams! Thanks for your response! I just tried to download the same data extract on a Mac desktop using a wired Ethernet connection. I ran into the same problem: Error 79 appears each time I try to open the zipped data file. I will reach out to the IPUMS support staff. Thanks again!

Hi @Scott_Nordstrom! Thanks for the suggestion. I tried using The Unarchiver and it did succeed in unzipping the file, but it only created a data set of 140 bytes. The data appeared to account for only one person. I think this is a download-related issue. Thanks again!!

I should have explicitly suggested The Unarchiver earlier (thanks @Scott_Nordstrom!) – I incorrectly assumed you had tried this specific tool. I strongly suspect that was the issue.

Looking at the linked full count extracts in your account, I see that they all apply case selections beyond those needed to restrict the extract to records that are linked over time. I am sharing a note from the summary documentation on linked census data extracts for your reference:

Users should be cautious about adding case selections beyond those that are applied automatically by the system to identify linked individuals across census years. Performing case selection on time-variant characteristics (such as age, marital status, or state of residence) risks excluding some observations for a person. An individual may be linked in a census year, but the observation will be dropped if they do not meet the additional selection criteria in that specific census.

Using your most recent extract (4) as an example, I see that you are linking all available decennial censuses from 1850 to 1940 and further selecting cases based on STATEICP values that correspond to Texas. This means that, to be included in your extract, a person must be alive in both 1850 and 1940, be linked across all censuses between those years, and reside in Texas every time they are observed in the census. Given such stringent criteria, it doesn’t seem unreasonable to me that only one record qualifies.
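To make the effect of stacking a state selection on top of linking concrete, here is a toy sketch in Python. The records, person IDs, and the three-census window are entirely made up for illustration, and the code does not reproduce how the IPUMS extract system works internally. It only shows how a filter on a time-variant characteristic shrinks the set of people observed in every census.

```python
# Hypothetical linked records: (person_id, census_year, state).
records = [
    ("A", 1850, "Texas"), ("A", 1860, "Texas"), ("A", 1870, "Texas"),
    ("B", 1850, "Texas"), ("B", 1860, "Ohio"),  ("B", 1870, "Texas"),
    ("C", 1850, "Ohio"),  ("C", 1860, "Ohio"),  ("C", 1870, "Ohio"),
]
YEARS = {1850, 1860, 1870}  # toy linking window, not the real 1850-1940 span


def years_by_person(rows):
    """Map each person_id to the set of census years in which they appear."""
    seen = {}
    for person_id, year, _state in rows:
        seen.setdefault(person_id, set()).add(year)
    return seen


def fully_linked(rows, years=YEARS):
    """Return the sorted person_ids observed in every census year in `years`."""
    return sorted(p for p, seen in years_by_person(rows).items() if seen >= years)


# Case selection on a time-variant characteristic: keep Texas observations only.
texas_only = [row for row in records if row[2] == "Texas"]

print(fully_linked(records))     # all three people are linked in every year
print(len(texas_only))           # B's 1860 record and all of C's are dropped
print(fully_linked(texas_only))  # only A is in Texas in every census
```

In this toy data, person B lived in Texas in two of three censuses but still falls out once every observation has to satisfy the state filter, which is the same mechanism that can leave a real extract with a single record.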

I hope this helps. Please follow up with any questions.