XML tag mismatch error in R

Hi, first time IPUMS user here! I am required to use data from this source for a “data test” related to an internship application, so I’m a little nervous about time here - I have to hand this assignment in three days from now. Hopefully I’m just doing something silly and this will be a quick fix!

I’m getting the following error from R after running the provided code:
Error in read_xml.character(ddi_file_load, data_layer = NULL) :
Opening and ending tag mismatch: meta line 12 and head [76]

Here is the exact code I’ve run:
if (!require("ipumsr")) stop("Reading IPUMS data into R requires the ipumsr package. It can be installed using the following command: install.packages('ipumsr')")
library(ipumsr)
ddi <- read_ipums_ddi("usa_00001.xml")

I found a similar issue from a few years ago in this post: https://forum.ipums.org/t/error-in-r-data-read/3573/10

So I added the following code to my .r file, per the solution listed there, even though the latest version of ipumsr should have corrected for it:
if (!require(remotes)) install.packages("remotes")
remotes::install_github("mnpopcenter/ipumsr")
library(remotes)

I am still getting the same error. I’ve downloaded the .xml file and the .dat.gz file, and have tried this both leaving the file zipped and unzipping it. All of the files are in the same directory as the .r script, so I’m at a loss for what to try next. I’ve double-checked the file names as well to make sure there wasn’t a typo anywhere.

The following portion of the error message makes me think your XML file may have downloaded incorrectly: “Opening and ending tag mismatch: meta line 12 and head [76]”. The meta and head tags it names are HTML tags, which suggests the saved file is a web page rather than the DDI. It sounds like you have done everything right so far. I would try re-downloading both the DDI and data files and running it again. If that doesn’t work, resubmit your extract and try again. I hope that helps!

Thank you so much, I’ll try that right now! I did have to use download.file() in R rather than right-clicking and using “Save Link As” since Chrome wouldn’t download the file, so maybe that created a problem somehow.

In that case, if this fix doesn’t work, I would try downloading the DDI directly from the hyperlink using a different web browser such as Internet Explorer. Good luck!

Thanks - that’s exactly what I did this time around. I redownloaded both the data and the DDI from the hyperlink using Safari, and got the same error message. I’m thinking of downloading a dataset with the minimum number of columns just as a test, but ultimately I will need the other columns in this dataset in order to do the assignment. Is there something else I should try?

Hi Allison! Sorry if this is what you’ve already tried, but note that if you are in Safari, when you right-click on the DDI link, you want to choose “Download Linked File” and not “Download Linked File As”. This process is described in the Introduction to ipumsr vignette.

One way to test whether the XML file has downloaded correctly is to open it in a text editor – you could even do this from R with

file.edit("usa_00001.xml")

If the file downloaded and saved correctly, the first few lines should look something like this:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="ipums-ddi-xslt.xsl"?>
<codeBook ID="ddi2-27c79f70-8e14-11e5-8c97-b82a72e0b782-usa_00032.dat-usa.ipums.org" version="2.5" xmlns="ddi:codebook:2_5" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="ddi:codebook:2_5 http://www.ddialliance.org/Specification/DDI-Codebook/2.5/XMLSchema/codebook.xsd">
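
If you would rather check this from code than by eye, here is a minimal sketch of the same test. It assumes the file name usa_00001.xml from the original post, and looks_like_ddi is a hypothetical helper written for this post, not part of ipumsr. It only looks at the first few lines, so treat it as a rough sanity check rather than real XML validation.

```r
# Hypothetical helper (not part of ipumsr): peek at the start of a file and
# guess whether it is an XML DDI or an HTML page saved under an .xml name.
looks_like_ddi <- function(path, n = 20) {
  first_lines <- readLines(path, n = n, warn = FALSE)
  text <- paste(first_lines, collapse = "\n")

  has_xml_decl <- grepl("<?xml", text, fixed = TRUE)      # XML declaration
  has_codebook <- grepl("<codeBook", text, fixed = TRUE)  # DDI root element
  has_html <- grepl("<html|<!DOCTYPE html", text, ignore.case = TRUE)

  has_xml_decl && has_codebook && !has_html
}

# Assumed file name, taken from the original post.
ddi_path <- "usa_00001.xml"
if (file.exists(ddi_path)) {
  # Show the first few lines so they can be compared with the sample above.
  print(head(readLines(ddi_path, warn = FALSE), 5))
  if (!looks_like_ddi(ddi_path)) {
    message("This does not look like an IPUMS DDI; it may be an HTML page saved as .xml.")
  }
}
```

If the message appears, the file on disk is probably not the DDI itself, and downloading it again from the browser is the next thing to try.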

Hi Derek,

Thanks so much for getting back to me! I ended up using the CSV file since I had a tight deadline for this work, but I will absolutely keep this in mind if I need to use IPUMS again.

Best,
Allison