Hi, I am currently pooling CPS ASEC data from 2013-2015, and I have a couple of questions.
- When pooling CPS ASEC data how do you account for duplicate respondents?
I used CPSID to account for duplicate respondents and kept only the most recent record when there were duplicates. Does this method sound right?
- Additionally, when using weights for pooled data, I basically divided the weights for my analysis by 2, since I’m using data from 2013-2015. Does this method also sound right?
Any single month of the CPS is meant to produce nationally and sub-nationally representative estimates when weights are used. Therefore, when you pool observations from the 2013, 2014, and 2015 ASEC, weighted totals (such as population counts) will be the sum of the estimates from each of the three individual years. You will need to divide your weights by three, not two, to produce an average for the three-year period.
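As a rough illustration of the rescaling, here is a minimal Python sketch using only the standard library. The column names YEAR and ASECWT are assumptions about how your extract is laid out, POOLWT is a hypothetical name for the rescaled weight, and the weight values are made up.

```python
# Toy person records. YEAR and ASECWT are assumed column names;
# the weight values are invented for illustration.
records = [
    {"YEAR": 2013, "ASECWT": 1500.0},
    {"YEAR": 2013, "ASECWT": 2500.0},
    {"YEAR": 2014, "ASECWT": 1800.0},
    {"YEAR": 2014, "ASECWT": 2200.0},
    {"YEAR": 2015, "ASECWT": 2100.0},
    {"YEAR": 2015, "ASECWT": 1900.0},
]


def add_pooled_weight(rows, weight_key="ASECWT", year_key="YEAR"):
    """Return copies of rows with POOLWT = weight / number of pooled years."""
    n_years = len({row[year_key] for row in rows})
    return [{**row, "POOLWT": row[weight_key] / n_years} for row in rows]


pooled = add_pooled_weight(records)

# Unadjusted weights add up across all three samples (three years stacked).
total_raw = sum(row["ASECWT"] for row in pooled)

# Rescaled weights give the average weighted count over the pooled years.
total_pooled = sum(row["POOLWT"] for row in pooled)
```

Counting the distinct years in the data, rather than hard-coding 3, keeps the divisor correct if you later add or drop a sample.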
Another result of pooling multiple years of the ASEC together is repeated observations of the same individual across years, due to the panel design of the CPS. While individual respondents can appear across multiple ASEC samples (e.g., ASEC 2014 and 2015), there are no duplicate respondents within any single CPS ASEC sample (e.g., within the 2013 ASEC). For reference, CPSID is the unique identifier for households in the CPS and CPSIDP is the unique identifier for individuals in the CPS. Each individual within a household will have the same CPSID value, so CPSID alone cannot distinguish individuals. Note also that all participants of the ASEC oversample (as indicated by ASECOVERP) will have a CPSIDP value of 0, since they cannot be linked across samples. Whether to retain repeated observations of the same individual across years is up to the researcher; it is worth consulting the literature in the field to see what others have done, with an eye towards the specific application.
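If you do decide to keep only the most recent observation of each person, one possible approach is sketched below in plain Python. It matches on CPSIDP rather than CPSID and leaves records with a CPSIDP of 0 alone, since those cannot be linked. The helper name keep_most_recent, the YEAR and ASECWT column names, and all values are assumptions for illustration, not a prescribed method.

```python
def keep_most_recent(rows, id_key="CPSIDP", year_key="YEAR"):
    """Keep the latest-year row per linkable person; keep every row with id 0."""
    latest = {}
    unlinkable = []
    for row in rows:
        pid = row[id_key]
        if pid == 0:
            # Oversample persons cannot be linked, so treat each as distinct.
            unlinkable.append(row)
        elif pid not in latest or row[year_key] > latest[pid][year_key]:
            latest[pid] = row
    return list(latest.values()) + unlinkable


# Toy pooled records; the CPSIDP values are invented for illustration.
records = [
    {"YEAR": 2013, "CPSIDP": 101, "ASECWT": 1500.0},
    {"YEAR": 2014, "CPSIDP": 101, "ASECWT": 1600.0},  # same person as above
    {"YEAR": 2014, "CPSIDP": 202, "ASECWT": 1800.0},
    {"YEAR": 2015, "CPSIDP": 202, "ASECWT": 1700.0},  # same person as above
    {"YEAR": 2015, "CPSIDP": 303, "ASECWT": 2100.0},
    {"YEAR": 2014, "CPSIDP": 0, "ASECWT": 900.0},     # oversample
    {"YEAR": 2015, "CPSIDP": 0, "ASECWT": 950.0},     # oversample, another person
]

deduped = keep_most_recent(records)
kept = sorted((row["CPSIDP"], row["YEAR"]) for row in deduped)
```

Dropping the CPSIDP == 0 check would collapse every oversample record into a single "person", which is almost certainly not what you want.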