I am trying to merge person files from March through June for longitudinal person-level analysis, using the syntax below (the code selects cases that are interviewed for the first time or for the fifth time in March 2020). When I run it, I find that the sample size increases over time (this is true even if I only select cases with MISH of 1-4). How is it possible that more people are in the sample in month 2 than in month 1?
data temp1;
  set ipums.recodedvariables2020;
  longvar=0;
  /* Cohort interviewed for the first time in March 2020 (MISH 1-4) */
  if month=3 and mish=1 then longvar=1;
  if month=4 and mish=2 then longvar=1;
  if month=5 and mish=3 then longvar=1;
  if month=6 and mish=4 then longvar=1;
  /* Cohort interviewed for the fifth time in March 2020 (MISH 5-8) */
  if month=3 and mish=5 then longvar=1;
  if month=4 and mish=6 then longvar=1;
  if month=5 and mish=7 then longvar=1;
  if month=6 and mish=8 then longvar=1;
run;
MONTH Frequency
March 8337
April 9568
May 10140
June 10277
This upward trend in sample size is expected: nonresponse declines as households progress through the CPS rotation pattern (see this Census Bureau Basic CPS Household Nonresponse page).
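One way to see this pattern in your own data is to tabulate the flagged records by month and month-in-sample, so that each of the two rotation cohorts can be followed separately. This is only a sketch: it assumes the temp1 data set and the longvar flag from your code, and that MONTH and MISH are stored as numeric variables.

```sas
/* Sketch: count flagged person records by MONTH and MISH.          */
/* Assumes temp1 and longvar exist as created in the question.      */
proc freq data=temp1;
  where longvar = 1;
  tables month*mish / norow nocol nopercent;
run;
```

Each month should show two nonzero cells (one per cohort), which makes it easier to check whether the growth appears in both cohorts.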
Your question is about response rates, but I want to make sure you are aware of the linking resources available through IPUMS. To link individual records longitudinally, your best option is to use CPSIDP (more information in this documentation). Note that CPSIDP linkages should be verified with AGE, SEX, and RACE in order to avoid erroneous links caused by errors in the source data. When you are ready to apply sample weights to your analysis, the appropriate weight is LNKFW1YWT, which is only included for the first linked observation (MISH 1, 2, 3, or 4). More information on longitudinal weights is available on this page. In the summer of 2018 we hosted a workshop focused on linking CPS data; a list of resources from that workshop is available here.
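As a rough illustration of linking on CPSIDP and then checking the links, here is a minimal SAS sketch that pairs each person's March and June records. It assumes that temp1 and longvar exist as in your code, that the extract contains CPSIDP, AGE, SEX, and RACE under those names, and that each CPSIDP appears at most once per month. The data set names march, june, and linked, the flag valid_link, and the age tolerance of 0 to 1 year are hypothetical choices for this example, not IPUMS recommendations.

```sas
/* Sketch only: march, june, linked, and valid_link are hypothetical names. */

/* March records, sorted by person identifier, with suffixed variable names. */
proc sort data=temp1
          out=march (keep=cpsidp age sex race
                     rename=(age=age_mar sex=sex_mar race=race_mar));
  where longvar = 1 and month = 3;
  by cpsidp;
run;

/* June records, prepared the same way. */
proc sort data=temp1
          out=june (keep=cpsidp age sex race
                    rename=(age=age_jun sex=sex_jun race=race_jun));
  where longvar = 1 and month = 6;
  by cpsidp;
run;

/* Keep persons present in both months and flag plausible links. */
data linked;
  merge march (in=in_mar) june (in=in_jun);
  by cpsidp;
  if in_mar and in_jun;
  /* Assumed rule: SEX and RACE must agree, and AGE may rise by at most 1. */
  valid_link = (sex_mar = sex_jun) and (race_mar = race_jun)
               and (0 <= age_jun - age_mar <= 1);
run;

/* Share of links that pass the consistency check. */
proc freq data=linked;
  tables valid_link;
run;
```

Records with valid_link = 0 are candidates for review or exclusion; how strict to make the check (for example, how to treat top-coded ages) is a judgment call that depends on your analysis.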