Is it OK to merge two ATUS extracts using only the variable "CPSIDP"?

Dear Madam/Sir,

I have two ATUS-X extracts. Both of them have “CPSIDP.” Is it OK to merge these two extracts using only “CPSIDP”? “CPSIDP” uniquely identifies the respondent, right? Thank you!

CPSIDP should uniquely identify individuals. However, the recommended variables for merging cases are YEAR, CASEID, and PERNUM, which should all be included in your extracts automatically.
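
In case it is useful, here is a minimal sketch of a merge on those three variables in plain Python. The rows are made-up placeholders rather than real ATUS records, FOODPREP is a hypothetical column name standing in for whatever time-use variable you extracted, and `merge_on_keys` is an illustrative helper, not an IPUMS tool.

```python
# Toy rows standing in for two extracts (values are invented for illustration).
extract_a = [
    {"YEAR": 2003, "CASEID": 20030100013280, "PERNUM": 1, "SEX": 1},
    {"YEAR": 2003, "CASEID": 20030100013344, "PERNUM": 1, "SEX": 2},
]
extract_b = [
    {"YEAR": 2003, "CASEID": 20030100013280, "PERNUM": 1, "FOODPREP": 30},
    {"YEAR": 2003, "CASEID": 20030100013344, "PERNUM": 1, "FOODPREP": 0},
]

KEYS = ("YEAR", "CASEID", "PERNUM")


def merge_on_keys(left, right, keys=KEYS):
    """Inner-join two lists of row dicts on the given key columns."""
    index = {}
    for row in right:
        key = tuple(row[col] for col in keys)
        if key in index:
            # A repeated key means the key columns do not identify rows uniquely.
            raise ValueError(f"duplicate key in right-hand extract: {key}")
        index[key] = row
    merged = []
    for row in left:
        key = tuple(row[col] for col in keys)
        if key in index:
            # Combine the columns from both extracts for the matching person.
            merged.append({**row, **index[key]})
    return merged


merged = merge_on_keys(extract_a, extract_b)
for row in merged:
    print(row)
```

The duplicate-key check is there so that the merge fails loudly if the key columns turn out not to be unique in one of the files.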

I hope this helps.

Hi, I’m going to merge datasets from several years, using variables across the extraction categories.

  1. When I do a test merge of two small datasets, A:(sex) and B:(sex+food prep), both from 2003 only, I see that PERNUM=1 for all observations in each of the individual datasets. Is that peculiar to my test datasets, the year 2003, or both?

  2. CASEID values begin with the year, in my case 2003. Does that mean I don’t need to use YEAR as a merge variable when merging data from multiple years?

thanks much, Sanjiv

I’ll try to answer your questions one at a time.

(1) If your IPUMS Time Use data extracts only include the “respondents” ATUS file, then by definition, the only individuals included in the file will be those with PERNUM==1. You can verify this detail by looking at the codes tab of the PERNUM variable and selecting different radio buttons for “respondents” or “respondents and household members,” etc. Depending on your ultimate goal, you may want to also include household members in your data extract.

(2) Yes, since you are only using one year of ATUS data, merging on YEAR and CASEID will be redundant.


Thanks Jeff. Re: (2), I’m asking about merging data from multiple years–in that case, do I need YEAR, or is CASEID sufficient since it begins with the year? The scenario is that I start by extracting data for all available years and some variables, then later decide to add other variables. In that case will I need to use YEAR? thanks much, Sanjiv

Ah, sorry. I misunderstood your question. Since CASEID begins with the survey year, merging on CASEID and YEAR will be redundant.
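
If you would like to confirm this on your own multi-year extract before dropping YEAR from the merge key, a quick check along these lines may help. It is only a sketch: the rows are made-up placeholders, and `caseid_year` is a hypothetical helper that assumes the first four digits of CASEID are the survey year, as described above.

```python
# Toy multi-year rows (invented values, not real ATUS records).
rows = [
    {"YEAR": 2003, "CASEID": 20030100013280, "PERNUM": 1},
    {"YEAR": 2004, "CASEID": 20040101040012, "PERNUM": 1},
    {"YEAR": 2005, "CASEID": 20050112050987, "PERNUM": 1},
]


def caseid_year(caseid):
    """Return the leading four digits of CASEID as an integer year."""
    return int(str(caseid)[:4])


# Any row listed here would mean YEAR is NOT implied by CASEID for that record.
mismatches = [r for r in rows if caseid_year(r["CASEID"]) != r["YEAR"]]
print("rows where CASEID prefix differs from YEAR:", len(mismatches))
```

An empty `mismatches` list on your real data would support merging on CASEID without YEAR. PERNUM would still matter if your extracts include household members and not only respondents.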
