I’ve been trying to link consecutive years of the ASEC using CPSIDP, from 2013-17. I’ve dropped the oversample and also restricted the dataset to ages 16-54. This leaves about 40,000 people potentially eligible to link each year (accounting for the fact that only half of each year’s sample could be linked a year forward), but only about half of these (20,000) successfully link each year. Does this seem right?
I just looked into this myself and found close to 40,000 matches in each of the years 2014 through 2016. Match counts in 2013 and 2017 are roughly half of this (about 20,000) because your extract does not include records from 2012 or 2018 to match against.
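As a rough illustration of why the end years of an extract show fewer matches, here is a minimal Python sketch that counts, for each year, the people whose CPSIDP also appears in an adjacent year. The toy records, the `count_linked` function, and the tuple layout are all hypothetical and are not IPUMS code. The sketch also assumes that oversample records carry a CPSIDP of 0 and that the age restriction is applied in both years of a pair.

```python
from collections import defaultdict

# Hypothetical toy records: (YEAR, CPSIDP, AGE). A real extract has many more
# columns and far more rows; these values are made up for illustration.
records = [
    (2013, 101, 30), (2013, 102, 40),
    (2014, 101, 31), (2014, 103, 25), (2014, 104, 50),
    (2015, 103, 26), (2015, 104, 51), (2015, 105, 20),
]


def count_linked(records, min_age=16, max_age=54):
    """Count persons per year whose CPSIDP also appears in year-1 or year+1."""
    ids_by_year = defaultdict(set)
    for year, cpsidp, age in records:
        # Assumption: CPSIDP == 0 marks unlinkable (oversample) records.
        # The age filter is applied to every year, so a person who ages out
        # of the range between two years will not count as linked.
        if cpsidp != 0 and min_age <= age <= max_age:
            ids_by_year[year].add(cpsidp)

    counts = {}
    for year, ids in ids_by_year.items():
        # Pool the identifiers seen in the previous and the following year.
        neighbours = ids_by_year.get(year - 1, set()) | ids_by_year.get(year + 1, set())
        counts[year] = len(ids & neighbours)
    return counts


if __name__ == "__main__":
    for year, n in sorted(count_linked(records).items()):
        print(year, n)
```

In this toy data the middle year can match in both directions, while the first and last years can each match in only one direction, which is the same pattern as 2014-2016 versus 2013 and 2017 in your extract.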
We have some relatively new resources to help users with linking issues. The first is the general Linking and the CPS information page. This includes basic documentation along with a visualization of what samples are linkable with other samples. Additionally, we just hosted an entire workshop on linking in the CPS. All of the presentations and exercises associated with this workshop are posted here. For your task, I’d suggest looking at this exercise with Stata syntax here.
Thanks, Jeff. Just to be clear, did you restrict the ages to 16-54? I actually attended that workshop, so I feel fairly confident that I’m doing this the right way, but I wanted to check since the match rate was so low. I just re-ran this including those aged 55+ and got between 34k and 40k matches per year. But I’m still getting about 20k for ages 16-54.
Yeah, I did restrict to the age range you are looking at. Do you mind sending an email to firstname.lastname@example.org with the code you are using to make these matches? Maybe this will reveal the source of the discrepancy.