Missing data to replicate a paper - USA 1850 100%

Good morning,

I am trying to replicate a paper by Collins and Zimran (2018) that uses IPUMS NAPP data from the 1850 and 1880 US censuses. They provide a replication package that includes the notebook for each census. When I look at the 1850 NAPP (or rather IPUMS International) US 100% sample, I cannot find many of the variables they include. The complete list of the variables they use is:

Variable         Columns  Len  US50
SAMPLE        H  1-4      4    X
SERIAL        H  5-14     10   X
CNTRY         H  15-17    3    X
YEAR          H  18-21    4    X
PERNUM        P  22-25    4    X
PERWT         P  26-29    4    X
AGE           P  30-32    3    X
SEX           P  33       1    X
BIRTHYR       P  34-37    4    X
AGEMONTH      P  38-39    2    X
NATIVITY      P  40-41    2    X
BPLCNTRY      P  42-46    5    X
BPLUS         P  47-48    2    X
RACE          P  49-50    2    X
NAMELAST      P  51-82    32   X
NAMEFRST      P  83-114   32   X
PERNUM_HEAD   P  115-118  4    X
PERNUM_MOM    P  119-122  4    X
PERNUM_POP    P  123-126  4    X
NATIVITY_MOM  P  129-130  2    X
NATIVITY_POP  P  131-132  2    X
BPLCNTRY_MOM  P  138-142  5    X
BPLCNTRY_POP  P  143-147  5    X
BPLUS_HEAD    P  148-149  2    X
BPLUS_MOM     P  150-151  2    X
BPLUS_POP     P  152-153  2    X

But I cannot find any of the variables after NAMEFRST (that is, the person number, nativity, country of birth, and state of birth of each individual's household head, mother, and father). I have also checked the US 1% sample and even the IPUMS USA dataset, with no luck.

I have also checked the list of all variables for each census (harmonized and source) and still I cannot find these missing variables. Is there something I am doing wrong?

Thank you very much for your time,


The variables you mention appear to have been created by the researchers with the IPUMS attach characteristics tool. This tool creates a new variable that reports a selected characteristic of one of the respondent’s available relations. You can find the tool on the extract request page after viewing your data cart (see this video tutorial).

It appears the researchers added the person number (PERNUM), nativity (NATIVITY), country of birth (BPLCNTRY), and state of birth (BPLUS) of the respondent’s household head (HEAD), mother (MOM), and father (POP). The mother and father links are based on the pointer variables MOMLOC and POPLOC. These probable family links were imputed by IPUMS using available data as described in this guide. You can find further information on the type of link imputed in MOMRULEH and POPRULEH. Note that you can alternatively derive mother’s and father’s birth country and state from the original source variables MBPL/FBPL and MBPLUS/FBPLUS.
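
If you would rather rebuild the parent variables yourself from an extract that contains MOMLOC and POPLOC, the logic is roughly the following. This is only a minimal sketch in plain Python, not the tool's actual implementation. It assumes a single sample (so SERIAL identifies the household), that MOMLOC/POPLOC hold the PERNUM of the linked parent within the same household with 0 meaning no link, and that unlinked persons get None (the real tool may code this differently). The function name `attach_characteristics` and the values in the example records are placeholders, not real IPUMS codes.

```python
def attach_characteristics(persons, variables=("NATIVITY", "BPLCNTRY", "BPLUS")):
    """Return copies of the person records with mother's/father's values attached.

    persons: list of dicts, each with SERIAL, PERNUM, MOMLOC, POPLOC and
    every name listed in `variables`.
    """
    # Index every person by (household, person number) for direct lookup.
    by_key = {(p["SERIAL"], p["PERNUM"]): p for p in persons}

    out = []
    for p in persons:
        row = dict(p)
        for suffix, pointer in (("MOM", "MOMLOC"), ("POP", "POPLOC")):
            # A pointer of 0 is treated as "no parent linked in this household".
            relative = by_key.get((p["SERIAL"], p[pointer])) if p[pointer] else None
            row[f"PERNUM_{suffix}"] = relative["PERNUM"] if relative else None
            for var in variables:
                row[f"{var}_{suffix}"] = relative[var] if relative else None
        out.append(row)
    return out


# Hypothetical household: father (1), mother (2), child (3). Values are made up.
persons = [
    {"SERIAL": 1, "PERNUM": 1, "MOMLOC": 0, "POPLOC": 0,
     "NATIVITY": 2, "BPLCNTRY": 100, "BPLUS": 0},
    {"SERIAL": 1, "PERNUM": 2, "MOMLOC": 0, "POPLOC": 0,
     "NATIVITY": 1, "BPLCNTRY": 200, "BPLUS": 36},
    {"SERIAL": 1, "PERNUM": 3, "MOMLOC": 2, "POPLOC": 1,
     "NATIVITY": 1, "BPLCNTRY": 200, "BPLUS": 36},
]

linked = attach_characteristics(persons)
child = linked[2]
print(child["PERNUM_MOM"], child["BPLUS_MOM"], child["BPLCNTRY_POP"])
```

The household head versions (PERNUM_HEAD, BPLUS_HEAD) would need a different lookup, since MOMLOC and POPLOC only point to parents; the sketch leaves them out.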