For maximum power, I’m planning to use both the ACS and CPS samples in years in which they overlap. One worry is that the same household is sampled twice (very unlikely, but I’d like to address this possibility in my paper). If this were to happen, would this household have the same serial number in the ACS and CPS so that I could identify it?
The variable SERIAL is generated to uniquely identify households within a sample, so it cannot identify the same household across the ACS and the CPS. Because the ACS and CPS are drawn as separate samples, it is not possible to know whether a given household appears in both samples or in only one. You are likely correct that it is highly improbable that a household is in both, but there is no way to know definitively, because the Census Bureau and the BLS put a lot of effort into ensuring that specific households or people cannot be identified in public use files. We generally advise against combining ACS and CPS samples when they come from the same year, although many users do use the CPS to bridge the gaps between decennial census years.
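To illustrate the point about SERIAL being unique only within a sample, here is a minimal Python sketch using made-up toy records. The values, the `SOURCE` tag, and the `HH_KEY` composite key are assumptions for illustration, not IPUMS variables, and a real extract may need further identifiers in the key.

```python
# Toy records with invented values; real extracts have many more variables.
acs_rows = [
    {"YEAR": 2015, "SERIAL": 1, "HHINCOME": 52000},
    {"YEAR": 2015, "SERIAL": 2, "HHINCOME": 81000},
]
cps_rows = [
    {"YEAR": 2015, "SERIAL": 1, "HHINCOME": 47000},
    {"YEAR": 2015, "SERIAL": 3, "HHINCOME": 66000},
]


def pool(samples):
    """Stack samples and tag each row with a composite household key."""
    pooled = []
    for source, rows in samples.items():
        for row in rows:
            tagged = dict(row)
            # SOURCE and HH_KEY are hypothetical names, not IPUMS variables.
            tagged["SOURCE"] = source
            tagged["HH_KEY"] = (source, row["YEAR"], row["SERIAL"])
            pooled.append(tagged)
    return pooled


pooled = pool({"ACS": acs_rows, "CPS": cps_rows})

# SERIAL values that occur in both samples. An overlap here reflects only
# coincidental numbering; it says nothing about whether the households match.
shared_serials = {r["SERIAL"] for r in acs_rows} & {r["SERIAL"] for r in cps_rows}

# The composite key keeps every pooled row distinct.
hh_keys = [r["HH_KEY"] for r in pooled]
```

In this toy data SERIAL 1 occurs in both samples, yet the two rows are unrelated households. The composite key only prevents ID collisions in the pooled file; it does not detect a household that was genuinely sampled by both surveys.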
Thanks for your answer! Could you please explain why you advise against combining ACS and CPS samples from the same year?
I wish we had better documentation on this (I hope we can add it in the future), but it is largely due to the concern you are aiming to address re: potentially double counting households. To the best of our knowledge, there is no way to ensure that pooling the ACS and the CPS in the same years doesn’t include the same household twice, although, as you point out, the likelihood of this is very slim. Of course, this is just our advice; you don’t have to follow it.