I’ve been following the basic walkthrough for using R to download NHGIS data. I’m trying to download 2010 census tract shapefiles, but in the final step, when I actually perform the download using the provided download link, the download stops at 80-90%. R returns this warning message:
Warning message:
In download.file(des_df$download_links$gis_data, zip_file, headers = c(Authorization = my_key)) :
downloaded length 430096407 != reported length 503138568
and I’m unable to unzip the downloaded data, either manually or using unzip(). I’ve tried this multiple times over multiple days, and as far as I know there’s no way to split the download into separate states. Can someone provide some insight into what’s happening here?
For additional context, here’s the rest of my code:
library(tidyverse)
library(sf)
library(httr)
library(jsonlite)
library(ipumsr)
# This is my key -- a new one can be obtained from the IPUMS website
my_key <- c("MYKEY")
url <- "https://api.ipums.org/extracts/?product=nhgis&version=v1"
# JSON body describing the extract request
mybody <- '
{
  "shapefiles": [
    "us_tract_2010_tl2010"
  ],
  "description": "2010 tract shapefiles",
  "breakdown_and_data_type_layout": "single_file"
}
'
mybody_json <- fromJSON(mybody, simplifyVector = FALSE)
result <- POST(url, add_headers(Authorization = my_key), body = mybody_json, encode = "json", verbose())
res_df <- content(result, "parsed", simplifyDataFrame = TRUE)
my_number <- res_df$number
data_extract_status_res <- GET(paste0("https://api.ipums.org/extracts/", my_number, "?product=nhgis&version=v1"), add_headers(Authorization = my_key))
des_df <- content(data_extract_status_res, "parsed", simplifyDataFrame = TRUE)
des_df$download_links
# Download the GIS data (shapefiles) and read them into R
# Destination file
zip_file <- "NHGIS_2010tracts.zip"
# Download extract to destination file
download.file(des_df$download_links$gis_data, zip_file, headers=c(Authorization=my_key))
# List extract files in ZIP archive
unzip(zip_file, list=TRUE)
# Read the 2010 tract shapefile into an sf object
tracts2010 <- read_ipums_sf(zip_file)
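One possible explanation, which is an assumption and not something the warning itself confirms: download.file() is subject to R's timeout option, which defaults to 60 seconds, so a transfer of roughly 500 MB can be cut off partway through if it takes longer than that. The sketch below raises the timeout for a single download and compares the size on disk with the size the server reported. The helper names download_with_timeout() and is_complete_download() are hypothetical (they are not part of ipumsr or the NHGIS walkthrough), and the one-hour timeout is an arbitrary choice.

```r
# Hypothetical helper: download a file with a longer timeout than R's default.
# Assumption: the truncated download is caused by the 60-second default.
download_with_timeout <- function(url, dest, key, timeout_sec = 3600) {
  old_opts <- options(timeout = timeout_sec)  # temporarily raise the timeout
  on.exit(options(old_opts), add = TRUE)      # restore the previous setting
  # mode = "wb" writes the ZIP as binary on every platform
  download.file(url, dest, mode = "wb", headers = c(Authorization = key))
}

# Hypothetical helper: TRUE only if the file on disk has the expected size.
is_complete_download <- function(path, expected_bytes) {
  file.exists(path) && file.size(path) == expected_bytes
}

# Share of the file that arrived, using the two lengths from the warning
downloaded_share <- 430096407 / 503138568
```

With the objects from the code above, the call would be download_with_timeout(des_df$download_links$gis_data, zip_file, my_key), followed by is_complete_download(zip_file, 503138568) before attempting unzip(). If the sizes still differ with a generous timeout, the cause is probably something other than the timeout.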