I have read several posts on the forum, as well as this paper https://pop.umn.edu/sites/pop.umn.edu… …and I am wondering what mistake I am making if, when I want to match people appearing in 2 consecutive ASEC samples, I just download variables for 2 consecutive ASEC samples, sort the dataset by CPSIDP and YEAR, and keep only those observations for which CPSIDP(year t)=CPSIDP(year t-1). Note that I am interested only in variables (INCFARM, INCBUS, INCWAGE, IND50LY) that appear only in the ASEC, all of which concern income and industry in the previous calendar year.
This seems too easy, and I am guessing I am making some elementary mistake.
A few notes might be helpful. First, if you download two IPUMS CPS samples, the two samples will be appended to each other. This means that an individual who is in both samples occupies two rows in your dataset, one per year, rather than a single row holding both years' values. This is one key reason why the method you describe above does not work correctly. Second, the paper you linked to is all about linking ASEC samples to non-ASEC (basic monthly) CPS samples. That process is a bit more involved than what it sounds like you are trying to do. Third, you can link the individuals who are in two consecutive ASEC samples by using the variable CPSIDP, which identifies individuals across CPS samples. Following these steps should get you close to where you want to go:
(1) Download the two consecutive ASEC samples individually (i.e. one at a time).
(2) In each sample, rename the variables so that they remain identifiable after merging with the other year (e.g. AGE16 and AGE17).
(3) In Stata or your preferred statistical software, perform a merge using CPSIDP as the linking variable.
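(4) Verify linkages with AGE, SEX, and RACE: for a correctly linked person, SEX and RACE should agree across the two years and AGE should advance by roughly one year.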
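For readers working outside Stata, steps (2) through (4) can be sketched in plain Python. This is only an illustration under stated assumptions, not an official IPUMS procedure: the rows are made-up toy data, the function names are hypothetical, records with CPSIDP equal to 0 are assumed to be unlinkable and are dropped, and the 0 to 2 year age window is an arbitrary tolerance you may want to tighten or loosen.

```python
def suffix_columns(rows, year_suffix, key="CPSIDP"):
    """Step 2: rename every column except the key, e.g. AGE -> AGE16."""
    return [
        {(name if name == key else f"{name}{year_suffix}"): value
         for name, value in row.items()}
        for row in rows
    ]


def merge_on_key(left, right, key="CPSIDP"):
    """Step 3: one-to-one merge, keeping only IDs present in both years.

    Assumption: a key of 0 marks a record that cannot be linked.
    """
    right_by_id = {row[key]: row for row in right if row[key] != 0}
    return [
        {**row, **right_by_id[row[key]]}
        for row in left
        if row[key] != 0 and row[key] in right_by_id
    ]


def plausible_link(row, y0="16", y1="17"):
    """Step 4: SEX and RACE must agree; AGE may rise by 0 to 2 years (assumed)."""
    same_sex = row[f"SEX{y0}"] == row[f"SEX{y1}"]
    same_race = row[f"RACE{y0}"] == row[f"RACE{y1}"]
    age_gap = row[f"AGE{y1}"] - row[f"AGE{y0}"]
    return same_sex and same_race and 0 <= age_gap <= 2


# Toy stand-ins for two separately downloaded ASEC extracts (invented values).
asec16 = [
    {"CPSIDP": 101, "AGE": 34, "SEX": 1, "RACE": 100, "INCWAGE": 42000},
    {"CPSIDP": 102, "AGE": 51, "SEX": 2, "RACE": 200, "INCWAGE": 30000},
    {"CPSIDP": 103, "AGE": 28, "SEX": 1, "RACE": 100, "INCWAGE": 25000},
    {"CPSIDP": 0, "AGE": 40, "SEX": 2, "RACE": 100, "INCWAGE": 10000},
]
asec17 = [
    {"CPSIDP": 101, "AGE": 35, "SEX": 1, "RACE": 100, "INCWAGE": 43500},
    {"CPSIDP": 102, "AGE": 23, "SEX": 1, "RACE": 200, "INCWAGE": 18000},
    {"CPSIDP": 104, "AGE": 60, "SEX": 2, "RACE": 100, "INCWAGE": 52000},
    {"CPSIDP": 0, "AGE": 19, "SEX": 1, "RACE": 200, "INCWAGE": 5000},
]

merged = merge_on_key(suffix_columns(asec16, "16"), suffix_columns(asec17, "17"))
verified = [row for row in merged if plausible_link(row)]
```

In this toy data, IDs 101 and 102 appear in both years, but 102 fails the check because SEX differs and AGE falls, so only 101 survives verification. Each surviving row holds both years side by side (for example INCWAGE16 and INCWAGE17), which is the layout the rename-then-merge steps are meant to produce.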