Help with SAS code for merging records based on family interrelationships

SHochhalter · December 19, 2013, 6:22pm

I am working on creating a study sample using the South African 2001 Census and 2007 Community Survey data. The main variable of interest we are looking at is the number of children born alive who have died for each mother (CHBORN minus CHSURV). The sample we’d like to create would include all individuals identified as mothers, with "paternal’ (‘paternal’ as in the men who are fathers to her children) and child variables added on to each maternal record.

I found the sample SAS code for merging records based on family interrelationship available in the first End Note of the Family Interrelationships page to be very helpful. However, when I run this code, the new data set has a greater number of observations then the original. Can you help me figure out what is going wrong here?

Thanks,

Stephanie

grover · December 20, 2013, 2:10pm

I was not able to reproduce your problem using an extract I created. Please attach a file containing your syntax in a reply to this question or send a copy of you syntax file to ipums@umn.edu so that I can get a better understanding of what may be causing your problem.

SHochhalter · December 30, 2013, 11:57pm

“I was not able to reproduce your problem using an extract I created. Please attach a file containing your syntax in a reply to this question or send a copy of you syntax file to ipums@umn.edu so that I can get a better understanding of what may be causing your problem.”

Thank you for your reply.

I have attached a text file with my syntax in my reply. When I run each set of procedures to attach each spouse characteristic, I get a data set with more records than the previous data set.

Original extract: 4,773,312 records

Merge 1: 4,773,755 records

Merge 2: 4,775,194 records

Merge 3: 4,783,462 records

Merge 4: 5,196,790 records

I have more characteristics I want to attach, but I’ve stopped until I figure out what I am doing wrong here. Thank you for your help.

Merging syntax-1.txt (2.43 KB)
Merging syntax.txt (2.43 KB)

grover · January 3, 2014, 6:58pm

As I have been looking into this further, it appears that the problem is stemming from individuals in polygamous marriages. If you look at the variables age_sp and your merged variable sp_age you will see that about 870 individuals do not have matching values (even though everyone should), this is due to some men having multiple wives, but only one of them can be that man’s SPLOC value.

One way around this may be to create two separate datasets, one containing only women and another containing only men, then merge the men dataset onto the women dataset using the variable sploc from the women dataset and the variable pernum from the men dataset. Since you are interested primarily in women you could then discard the men dataset.
I hope this helps.

SHochhalter · January 4, 2014, 12:07am

Thank you so much for looking into this for me.

I’ve created two separate datasets, one containing only women and one containing only men, but I’m unclear on how exactly to merge the men dataset onto the women data set. Could you explain how to in a little more detail and perhaps attach a sample SAS code?

Again, that you for your assistance with this!

grover · January 6, 2014, 9:34pm

Basically you would use the same merging proceedure, but with the WOMEN dataset AS a and the MEN dataset AS b.

One thing to to differnetly, however, would be to rename the MEN variable pernum to sploc because now you will be merging on the WOMEN variable sploc. You should also remember to rename the MEN age variable to SP_AGE. Finally make sure both datasets are sorted on year, serial, and sploc

The merging procedure should look something like this:

CREATE TABLE IPUMS.merged1

AS SELECT a.*, b.SP_AGE label=“Spouse’s age”

FROM women AS a LEFT JOIN men AS b

ON (a.year=b.year AND a.serial=b.serial AND a.sploc=b.sploc) ;

QUIT ;

Topic		Replies	Views
What are the identifiers on the IPUMS CPS to merge files? CPS	3	798	December 2, 2013
How to create a sample of mothers and the father of her children and children for South Africa, 2001 and 2007? INTERNATIONAL	1	281	December 2, 2013
Family Interrelationships code INTERNATIONAL	3	330	August 2, 2016
Attention: Jeff Bloem/Michelle Pratt: Follow up on my earlier question. INTERNATIONAL	1	631	June 4, 2018
Children under 18 in CPS Household CPS	6	826	May 3, 2018

Help with SAS code for merging records based on family interrelationships

Related topics