Linking CPSIDP Non-Consecutive Months


I am trying to link individuals across February 2020 and April 2020 basic monthly CPS samples. I want to perform a time series analysis for changes in individuals’ employment from T1 (Feb. 2020) to T2 (April 2020).

I downloaded the samples separately and then tried to merge them in Stata 16 using the command merge 1:1 CPSIDP using _data file. However, the results show “matched (3)” only for records from one month, and when I scroll down the data file to April, the _merge variable says “using only (2).” Any suggestions would be greatly appreciated!

I would expect your merge to give some results of 1 (master only), some 2 (using only), and some 3 (matched). Only individuals who were in rotation groups 1, 2, 5 or 6 in February would be linked to a survey in April. And those same people would be in rotation groups 3, 4, 7 and 8, respectively, in April. Individuals in rotation groups 3, 4, 7, or 8 in February, or 1, 2, 5, 6 in April, will not have a match. You can read more about the CPS rotation pattern here. Please follow up if this isn’t consistent with what you’re seeing.
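As a quick check of that pattern, something like the sketch below should show which month-in-sample groups fall into each merge category. The file names feb2020.dta and apr2020.dta are placeholders for your own extracts, and it assumes MISH is included in both.

```stata
* Sketch only: feb2020.dta and apr2020.dta are hypothetical file names.
use feb2020, clear
merge 1:1 cpsidp using apr2020

* For matched and master-only rows, mish holds the February value;
* for using-only rows it holds the April value.
tab mish _merge
```

If the rotation pattern holds, the matched rows should all have February MISH values of 1, 2, 5, or 6, while most master-only rows should have values of 3, 4, 7, or 8.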

I’ll also note that since you’re interested in time series analysis, it might be better to use the data in long format, and make use of Stata’s time series or panel data commands. If you create a single extract with both samples, it will automatically be in long format.

Thank you for your response.

Yes, about 40% of the sample matched, which I expected when using this resource: IPUMS CPS Rotation Pattern Explorer.

I figured out that I needed to rename variables in the April sample so they would merge properly. I added a “_4” suffix to each variable in the April sample. That way, the “matched (3)” records had one row per person with variables from February and April 2020 (wide format).
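A rough sketch of that renaming step, with feb2020.dta and apr2020.dta standing in as placeholder file names:

```stata
* Sketch only: file names are hypothetical placeholders.
use apr2020, clear
rename * *_4              // add a "_4" suffix to every April variable
rename cpsidp_4 cpsidp    // restore the merge key's original name
tempfile apr
save `apr'

use feb2020, clear
merge 1:1 cpsidp using `apr'
keep if _merge == 3       // one row per person, February and April side by side
```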

However, I would like to keep the data in long format, but I am unfamiliar with the process. If I create a single extract with both the February and April monthly samples, how can I link CPSIDP values across February and April? In other words, how can I drop individuals who are not in both the February and April samples?

Thank you for your help!


Since you’re working in Stata I’ll give you some sample code that will drop records for people with only a single survey (assuming you have only two months in your extract):

* Count the number of records for each person
bys cpsidp: egen numobs = total(1)
* Keep only people who appear in both months
keep if numobs==2

For panel analysis, run the xtset command before using any panel estimation commands:

xtset cpsidp mish 
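For the February-to-April employment comparison you described, a minimal sketch along these lines may be a starting point. It assumes EMPSTAT is in your extract and treats codes 10 and 12 (at work; has job, not at work last week) as employed. The names employed and emp_change are made up for illustration.

```stata
* Sketch only: assumes empstat is in the extract and each cpsidp has two rows.
gen byte employed = inlist(empstat, 10, 12)

* Within each person, sorted by mish, row 1 is February and row 2 is April.
bysort cpsidp (mish): gen emp_change = employed - employed[_n-1] if _n == 2

* -1 = left employment, 0 = no change, 1 = entered employment
tab emp_change
```

You would likely want to restrict this to the age range and universe relevant to your analysis first.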

I’d recommend taking a look at the training materials on linking the CPS, especially the part on link validation. There is sample code there as well.
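As a rough illustration of what link validation involves (this is not the code from the training materials), you could check that characteristics which should not change over two months are consistent within each CPSIDP. The sketch assumes AGE and SEX are in your extract and that each person has exactly two records. The names sex_mismatch, age_diff, and first are hypothetical.

```stata
* Sketch only: assumes age and sex are in the extract and each cpsidp has two rows.
bysort cpsidp (mish): gen byte sex_mismatch = sex[1] != sex[2]
bysort cpsidp (mish): gen age_diff = age[2] - age[1]
bysort cpsidp (mish): gen byte first = _n == 1

* Count each person once; a mismatched sex or an implausible
* change in age suggests the two records may not be the same person.
tab sex_mismatch if first
tab age_diff if first
```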