Any thoughts on how to apply the codebook formatting to the dataset using R, XML, or SQL?

christopherrbyrd · November 10, 2015, 7:47pm

I don’t have access to SPSS, SAS, or STATA. Would love your input. Note: I’m prepared to write custom code to rebuild the existing basic codebook.

Tim_Moreland · November 10, 2015, 11:26pm

Here is one common method for reading formatted IPUMS data into R:

(1) Select “STATA” as your data format on the extract request screen. This will generate a .dta file when you submit your extract.

(2) In R, use the following code with the name of your .dta file:

library(foreign)

df <- read.dta("ipums_file.dta")

This reads your IPUMS file into a dataframe named “df” with value labels applied. I recommend this resource for more information on using IPUMS data in R.
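To show what "value labels applied" means in practice, here is a minimal, self-contained sketch. It does not use a real IPUMS extract: the `toy` data frame and its columns are made up for illustration, and the file is written to a temporary path so the round trip can be run anywhere. Keep in mind that `read.dta` from the foreign package only reads files saved in Stata version 12 format or earlier.

```r
library(foreign)

# Hypothetical toy data standing in for an IPUMS extract (not real IPUMS data)
toy <- data.frame(
  age = c(34, 51, 27),
  sex = factor(c("Male", "Female", "Female"), levels = c("Male", "Female"))
)

# Write a small .dta file; factors are stored as Stata value labels by default
tmp <- tempfile(fileext = ".dta")
write.dta(toy, tmp)

# Read it back; convert.factors = TRUE is the default, so labelled
# variables come back as R factors instead of bare numeric codes
df <- read.dta(tmp)

str(df)
levels(df$sex)
```

If you would rather keep the underlying numeric codes, `read.dta(tmp, convert.factors = FALSE)` returns them unlabelled.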

Hope this helps.

christopherrbyrd · November 12, 2015, 4:51pm

Thanks Tim,

The following update to your answer worked perfectly:

install.packages("readstata13")

library(readstata13)

df <- read.dta13(file = "./some_dir/some_file.dta")
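For anyone who ends up writing custom code to apply a codebook by hand, the sketch below shows one possible approach in base R. Everything in it is an assumption made for illustration: the `codebook` layout (one row per variable, code, and label), the helper name `apply_codebook`, and the sample codes and labels, which are not taken from an actual IPUMS codebook.

```r
# Hypothetical codebook: one row per (variable, code, label).
# The codes and labels here are illustrative only.
codebook <- data.frame(
  variable = c("SEX", "SEX", "MARST", "MARST"),
  code     = c(1, 2, 1, 6),
  label    = c("Male", "Female", "Married", "Never married"),
  stringsAsFactors = FALSE
)

# Hypothetical raw data holding numeric codes only
raw <- data.frame(SEX = c(1, 2, 2), MARST = c(6, 1, 6))

# Convert every column that appears in the codebook into a labelled factor;
# codes missing from the codebook become NA
apply_codebook <- function(data, codebook) {
  for (v in intersect(names(data), unique(codebook$variable))) {
    cb <- codebook[codebook$variable == v, ]
    data[[v]] <- factor(data[[v]], levels = cb$code, labels = cb$label)
  }
  data
}

labelled <- apply_codebook(raw, codebook)
str(labelled)
```

Columns that are not listed in the codebook are left untouched, so the same helper can be run on an extract that mixes coded and continuous variables.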

enystrom · November 13, 2015, 2:41pm

I wrote a set of Bash/AWK scripts to transform the data and load it into SQLite. Details here: https://github.com/ericnystrom/napptools
