Not sure if this is the right place to post, but I’m giving it a shot… I’m attempting to merge ASEC longitudinal files with March BMS files for the last few years. Is it the case that an individual can have a different pernum value in the March BMS than in the ASEC? For some people, I have two records, with the first reflecting their March BMS responses and the second reflecting their ASEC responses. This isn’t the case for everyone in the merged file. Am I doing something wrong? If pernum can indeed differ, how should one work around this? Thanks very much!
Not everyone in the ASEC will have a match in the March Basic sample, but everyone in the March Basic sample should have a match in the ASEC sample. The best way to merge these two files is using CPSIDP instead of PERNUM. Individuals in the ASEC that do not match to the March Basic will have CPSIDP=0. These individuals are in the ASEC oversample.
That’s very helpful, many thanks! If I may follow up on this, which precise identifiers should be used in merging CPS (March) and ASEC? Would it be statefip, year, month, cpsid, cpsidp, marbasecidp, and marbasecidh? I am looking to merge all waves from 1990 through the latest available. Many thanks in advance!
To merge a March Basic file with the corresponding ASEC file at the person level, you should use YEAR, MONTH, and CPSIDP.
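For anyone doing this in Python rather than Stata, here is a minimal pandas sketch of that person-level merge. The two DataFrames and all of their values are invented for illustration, and EMPSTAT and INCTOT simply stand in for whatever variables your extracts contain. ASEC records with CPSIDP equal to 0 are set aside first, since they cannot link and would otherwise make the merge keys non-unique.

```python
import pandas as pd

KEYS = ["YEAR", "MONTH", "CPSIDP"]

# Toy stand-ins for the two extracts; every value here is made up.
basic = pd.DataFrame({
    "YEAR": [2018, 2018, 2018],
    "MONTH": [3, 3, 3],
    "CPSIDP": [101, 102, 103],
    "EMPSTAT": [10, 21, 10],
})
asec = pd.DataFrame({
    "YEAR": [2018, 2018, 2018, 2018],
    "MONTH": [3, 3, 3, 3],
    "CPSIDP": [101, 102, 0, 0],  # 0 marks records that cannot be linked
    "INCTOT": [52000, 18000, 40000, 7500],
})

# Keep only ASEC records that carry a usable person identifier.
asec_linkable = asec[asec["CPSIDP"] != 0]

# Left merge from the March Basic; validate raises if the keys are not
# unique on either side, and indicator adds a _merge column for diagnostics.
merged = basic.merge(
    asec_linkable,
    on=KEYS,
    how="left",
    validate="one_to_one",
    indicator=True,
)

# Count matched ("both") versus unmatched ("left_only") March Basic records.
print(merged["_merge"].value_counts())
```

The `_merge` column makes it easy to see how many March Basic records found no ASEC partner in a given year.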
Because cpsidp == 0 for many observations in the ASEC, it seems this merge cannot be performed as is.
I could first split the ASEC into ASEC_1 (records with a nonzero cpsidp) and ASEC_2 (records with cpsidp == 0), then merge ASEC_1 with the March Basic file and append ASEC_2, but I was wondering whether there is a more straightforward solution, e.g. using year, month, cpsidp, and marbasecidp?
P.S. If I do that, it seems there are about 8000 observations each in 2016, 2017, and 2018 in the CPS that cannot be matched or uniquely identified in the ASEC. As far as I understand, this may be a reason for concern.
MARBASECIDP will not give you any more information than CPSIDP. That variable predated the creation of CPSIDP and has been retained in the IPUMS database for continuity. The process that you described is the right approach, since, as you mentioned, records in the ASEC oversample all have CPSIDP=0 and cannot be linked to the March Basic sample (or any other sample). The observations in the basic CPS that do not link with the ASEC in 2016-2018 were in the “split panel,” which was a survey experiment to test the 2014 income question redesign. You can read more about that in this thread: “Discrepancies between number of obs in 2018 ASEC and March BMS and problem with matches in 2021” (post #2 by Ivan_Strahof).
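In case it helps others reading this thread, the split, merge, and append steps could look roughly like the pandas sketch below. The function name `merge_basic_asec` and the toy data are hypothetical, and the choice of an outer merge (so that unmatched March Basic records such as the split panel are kept) is an assumption about what you want to retain, not a requirement.

```python
import pandas as pd

KEYS = ["YEAR", "MONTH", "CPSIDP"]


def merge_basic_asec(basic: pd.DataFrame, asec: pd.DataFrame) -> pd.DataFrame:
    """Split the ASEC on CPSIDP, merge the linkable part, append the rest."""
    linkable = asec[asec["CPSIDP"] != 0]     # "ASEC_1" in the thread
    oversample = asec[asec["CPSIDP"] == 0]   # "ASEC_2" in the thread

    # Outer merge keeps March Basic records that have no ASEC match.
    linked = basic.merge(linkable, on=KEYS, how="outer", validate="one_to_one")

    # Stack the unlinkable ASEC records underneath; columns that exist only
    # in the March Basic file are filled with NaN for these rows.
    return pd.concat([linked, oversample], ignore_index=True)


# Toy data, invented for illustration only.
basic = pd.DataFrame({
    "YEAR": [2018, 2018, 2018],
    "MONTH": [3, 3, 3],
    "CPSIDP": [101, 102, 103],
    "EMPSTAT": [10, 21, 10],
})
asec = pd.DataFrame({
    "YEAR": [2018, 2018, 2018, 2018],
    "MONTH": [3, 3, 3, 3],
    "CPSIDP": [101, 102, 0, 0],
    "INCTOT": [52000, 18000, 40000, 7500],
})

combined = merge_basic_asec(basic, asec)
print(combined)
```

One caveat with this sketch: if both files share non-key variables, pandas adds `_x` and `_y` suffixes to them during the merge, so those columns would need to be renamed before the appended rows line up.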
Brilliant, many thanks. I wanted to be 100% certain that the splitting and stacking was the right approach. Thanks so much for your help!