Error in R data read

Hi all - new user of IPUMS here (and sorry if this is a repost).

I’m using IPUMS USA for a data project, and I’m encountering an error when trying to read my data into R. I’m using the following code (per the training exercises/other user posts)

rm(list=ls())
library(ipumsr)

ddi <- read_ipums_ddi("usa_00008.dat")
data <- read_ipums_micro(ddi)

I'm getting the following error: “Start tag expected, ‘<’ not found [4].” I’ve decompressed my data file from .dat.gz to .dat, and I’ve downloaded the data into my working directory. If anyone could point me in the right direction, that would be appreciated!

Welcome to IPUMS! In order to read your data using R, ensure that the following are true:

  1. You have downloaded the DDI file as an XML file (right-click on the link titled ‘DDI’ > Save Link As)
  2. Both the .dat file (data file) and .xml file (ddi codebook) are saved in the same folder as your R project (your working directory)
  3. The ipumsr package has been installed and loaded (install.packages("ipumsr") and library(ipumsr))

These instructions are outlined in more detail here.
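
As a rough pre-flight check for the first two items above, a base-R sketch along these lines could confirm that both files are where R expects them. The helper name check_extract_files is hypothetical (it is not part of ipumsr), and it assumes the DDI and data file share the same stem, e.g. usa_00008.

```r
# Hypothetical helper (not part of ipumsr): report whether the .xml DDI and
# the .dat data file for an extract are both present in a directory.
check_extract_files <- function(stem, dir = getwd()) {
  ddi_path <- file.path(dir, paste0(stem, ".xml"))
  dat_path <- file.path(dir, paste0(stem, ".dat"))
  # Named logical vector: TRUE means the file was found
  c(ddi = file.exists(ddi_path), data = file.exists(dat_path))
}

# Example call (assumes your extract is named usa_00008):
# check_extract_files("usa_00008")
```

If either element comes back FALSE, the corresponding file is missing from the folder or is saved under a different name.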

Your code should look something like this (the main difference being that the .xml file should be called in the read_ipums_ddi function):

# Install and load the ipumsr package
install.packages("ipumsr")
library(ipumsr)

# Load the data
ddi <- read_ipums_ddi("usa_00008.xml")
data <- read_ipums_micro(ddi)

#View a summary of the data
summary(data)

Hi there,

I used the exact code supplied with the data (same syntax as the chunk you provided) and am getting the following error:

Code entered:
ddi <- read_ipums_ddi("usa_00001.xml")
data <- read_ipums_micro(ddi)

“Error: Can’t combine ..1$val < character > and ..2$val < double >.”

I have installed and loaded the ipumsr package. I have uploaded the .dat file, the R file, and the DDI (in the form of an .xml file) associated with my dataset to R and they are in the same folder, which is the working directory. And the filename used is correct.

Any guidance on this would be greatly appreciated.

I am encountering the same issue, hoping someone could help.

This issue has come up recently and can be fixed by installing the development version of ipumsr from GitHub with the following commands:

if (!require(remotes)) install.packages("remotes")
remotes::install_github("mnpopcenter/ipumsr")

We will be releasing the development version to CRAN soon, so that users don’t need to install from GitHub to avoid this error in the future.

Hi Grace,

Thanks a lot for your help.
I am following the above steps and R reports: “cannot open URL 'https://api.github.com/repos/mnpopcenter/ipumsr/commits/master'”.
I checked in a browser and can connect to GitHub fine, so I have no idea why the error comes up.
Any guidance on this would be greatly appreciated.

The error you are encountering may be an issue with ‘remotes’, according to this GitHub page. Try this code instead, which uses ‘devtools’, as outlined on the ipumsr tech page.

if (!require(devtools)) install.packages("devtools")
devtools::install_github("mnpopcenter/ipumsr")

Thanks, Grace!

I stopped receiving alerts for responses to my original post, so I wasn’t aware you had posted fixes. Second fix (plus a bunch of unloading/reloading of packages) finally worked for me.

This is not working for me. The file extract is a .dat, but the code in the accompanying R command file for this extract references an .xml file. Therefore the file is not found when I run the command lines. When I change the file name to .dat, I get this error when I run the code:
Error in read_xml.character(ddi_file_load, data_layer = NULL) :
Start tag expected, ‘<’ not found [4]

I tried to edit the extract setting to .XML but I don’t see that as a data extract option.
I ran all the code in the previous comments from @Grace_Cooper.

The .dat file is automatically called through the .xml file when you run the following code (substituting ‘usa_00001.xml’ with the name of your extract):

ddi <- read_ipums_ddi("usa_00001.xml")
data <- read_ipums_micro(ddi)

In addition, make sure you save the .xml file to the same directory as your .dat file.

IPUMS released a new version of ipumsr to CRAN that should have fixed the other issues mentioned in this thread. If the problem persists after running the above code, verify that you have the correct version of ipumsr (version 0.4.5) with the command packageVersion("ipumsr"). Finally, if you are still having issues after trying these options, please provide a sample of the code you are using so that I can help you troubleshoot the issue.
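
If it helps, the version check can be wrapped in a small base-R helper. This is only a sketch: has_min_version is a hypothetical name, not an ipumsr function, and it treats 0.4.5 (the version mentioned above) as a minimum.

```r
# Hypothetical helper: TRUE if `pkg` is installed at `min_version` or later.
has_min_version <- function(pkg, min_version) {
  # A package that is not installed cannot satisfy the requirement
  if (!requireNamespace(pkg, quietly = TRUE)) return(FALSE)
  # Package versions can be compared directly against a version string
  utils::packageVersion(pkg) >= min_version
}

# Example call:
# has_min_version("ipumsr", "0.4.5")
```

A FALSE result suggests reinstalling with install.packages("ipumsr") and restarting R before trying again.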

I don’t see an .xml file anywhere. I can just see a .dat extract, a .r R code (2 lines of code) and .txt codebook file. Please advise where the .XML file is.

To download the .xml file associated with your extract, go to the “My Data” page, right-click on the word “DDI” for your extract under “Codebook”, and click “Save Link As”. Save the file in your working directory; it should automatically be saved as an .xml file. See page 6 of this data exercise for more step-by-step instructions on how to download and use an extract.
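
Once the file is saved, a quick sanity check is to confirm that it actually starts like an XML document. The sketch below uses only base R; looks_like_xml is a hypothetical helper name, and the check is deliberately crude (it only looks at the first non-blank line), so treat a TRUE result as "plausibly XML" rather than "valid DDI".

```r
# Hypothetical helper: TRUE if the first non-blank line of a file starts
# with "<", which is roughly what an XML parser expects to see first.
looks_like_xml <- function(path) {
  lines <- readLines(path, n = 20, warn = FALSE)
  # Drop blank or whitespace-only lines before inspecting the first one
  lines <- lines[nzchar(trimws(lines))]
  length(lines) > 0 && startsWith(trimws(lines[1]), "<")
}

# Example calls (file names are placeholders for your own extract):
# looks_like_xml("usa_00001.xml")
# looks_like_xml("usa_00001.dat")
```

A .dat file passed to read_ipums_ddi() would be expected to fail this check, which lines up with the “Start tag expected, ‘<’ not found” error reported earlier in the thread.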

Thank you very much. This worked. I didn’t realize that the DDI was an .xml file – I thought it was just a documentation file. My error. Thank you for this solution.

Hi! I’m having a similar problem, but using idhs data. I have tried everything in the thread, but when I run:

ddi <- read_ipums_ddi("idhs_00003.xml")
data <- read_ipums_micro(ddi)

I get:
Error in read_xml.character(ddi_file_load, file_select = NULL) :
Opening and ending tag mismatch: meta line 13 and head [76]

I set my working directory to a folder with both the .dat file and the .xml file. I’ve been having trouble finding resources online specific to the idhs data sets, maybe this is part of the problem?

Hi Stephanie,

This error seems to point to a bad DDI file. As a first step, I recommend redownloading and replacing the DDI file, being sure to follow the instructions here. If that doesn’t work, I would resubmit the extract request by clicking the Revise button next to the extract on the My Data page and submitting it with no changes. You can then redownload and replace the DDI and data files from the new copy of the extract. If neither of these solutions works, please follow up here on the forum or email us at ipums@umn.edu and we can help you troubleshoot further.