People, samples and lines

Thanks Grover! That’s very helpful.

In case you and Jeff have been dividing these questions, this question makes more sense in the context of this one:

I have really put too many questions in this one post. Sorry. So I will number them to make them easier to keep track of.

What I am trying to do now is to assemble a sample that includes all and only the ASEC, both as to people and as to variables (though I would kind of like some reassurance that there is little or nothing in the March Basic that is not in the ASEC Core). Later, I will want to link the Supplements to the ASEC. I am not currently planning on linking any Basic Monthly samples except those associated with the Supplements I have selected.

My actual samples included all of the ASEC, the default samples (all the Monthlies in 2018) and three or four Supplements. Rather than redefining the samples and downloading them all again, my theory was that if I take all and only the lines with ASECFLAG true, I would end up with counts that match those on the IPUMS CPS sample sizes page.
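A minimal sketch of that filter and count check, using only the Python standard library. The toy records are made up, and the coding of ASECFLAG (1 for an ASEC record) is an assumption that should be checked against the variable's codes page before relying on it.

```python
from collections import Counter

# ASSUMPTION: ASECFLAG == 1 marks an ASEC record. Verify the actual coding.
ASEC_CODE = 1

# Hypothetical toy extract: one dict per line (one line per person).
records = [
    {"YEAR": 2018, "MONTH": 3, "ASECFLAG": 1, "CPSIDP": 101},
    {"YEAR": 2018, "MONTH": 3, "ASECFLAG": 2, "CPSIDP": 101},
    {"YEAR": 2018, "MONTH": 4, "ASECFLAG": None, "CPSIDP": 101},
    {"YEAR": 2017, "MONTH": 3, "ASECFLAG": 1, "CPSIDP": 202},
    {"YEAR": 2017, "MONTH": 3, "ASECFLAG": 1, "CPSIDP": 203},
]


def asec_only(rows):
    """Keep all and only the lines flagged as ASEC."""
    return [r for r in rows if r["ASECFLAG"] == ASEC_CODE]


def counts_by_year(rows):
    """Number of person records per YEAR."""
    return Counter(r["YEAR"] for r in rows)


asec_counts = counts_by_year(asec_only(records))
for year in sorted(asec_counts):
    print(year, asec_counts[year])
```

The per-year counts printed here are what I would compare by hand against the published sample sizes.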

However, I now see that ASECFLAG only exists back to 1976. Also, I only see Basic Monthly samples back to that year. So I am supposing that there is no distinct March Basic Monthly prior to that year, and that using ASECFLAG should therefore still get me the right answer. (1) Does this seem correct?

Then, for Supplement linking, I think I should discard any record whose year/month combination does not match any of the months in which those Supplements occur. (2) Assuming that one of the Supplements occurs in a month that includes the Hispanic oversample, should I or should I not discard records marked by ASECFLAG? (3) Will there be records in these months with ASECFLAG marked, or are these persons transferred out of these Basic Monthly files when added to the ASEC?
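A minimal sketch of the year/month filter described here. The set of Supplement months and the toy records are hypothetical, and the sketch deliberately does not touch ASECFLAG, since whether to drop those records is exactly what questions (2) and (3) ask.

```python
# Hypothetical (YEAR, MONTH) pairs in which the selected Supplements occur.
SUPPLEMENT_MONTHS = {(2018, 3), (2018, 11)}

# Hypothetical toy extract: one dict per line (one line per person).
records = [
    {"YEAR": 2018, "MONTH": 3, "CPSIDP": 101},
    {"YEAR": 2018, "MONTH": 4, "CPSIDP": 101},
    {"YEAR": 2018, "MONTH": 11, "CPSIDP": 102},
    {"YEAR": 2017, "MONTH": 11, "CPSIDP": 103},
]


def in_supplement_month(row, months=SUPPLEMENT_MONTHS):
    """True if the record's year/month is one of the Supplement months."""
    return (row["YEAR"], row["MONTH"]) in months


kept = [r for r in records if in_supplement_month(r)]
discarded = [r for r in records if not in_supplement_month(r)]
print(len(kept), "kept;", len(discarded), "discarded")
```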

(4) Would I be correct in concluding that (at least under your rectangular format) Supplements and the Basic Monthly variables for the same month occur in a single line (one line per person)? (5) And the Core variables also occur on the same line as the ASEC variables, correct? (still 5) So in my case it is only the 2018 March Basic that will duplicate core variables (because I included your default samples, which included only the 2018 March Basic), correct?

Finally, I understand that conceptually linking occurs between the ASEC and the March Basic (and the ASEC and the Basic for the Hispanic oversample), and then from the March Basic (and oversample?) to the other Basics. (6) But so long as I have your CPSIDP, I do not need to follow this path, correct? I can go directly from the ASEC to the matching person in the Basic for the Supplement month, which would put me on the same line as that person’s Supplement record, correct?
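A minimal sketch of the direct route I have in mind, matching ASEC persons to their Basic Monthly line for a Supplement month on CPSIDP alone. The records, the Supplement month, and the variable name SUPP_VAR are hypothetical. Skipping records whose CPSIDP is 0 or missing is also an assumption about how unlinkable records are marked.

```python
# Hypothetical ASEC person records.
asec = [
    {"YEAR": 2018, "CPSIDP": 101, "INCTOT": 40000},
    {"YEAR": 2018, "CPSIDP": 102, "INCTOT": 25000},
    {"YEAR": 2018, "CPSIDP": 0, "INCTOT": 31000},  # assumed unlinkable
]

# Hypothetical Basic Monthly records; SUPP_VAR stands in for a Supplement
# variable carried on the same line as the Basic variables.
basic = [
    {"YEAR": 2018, "MONTH": 11, "CPSIDP": 101, "SUPP_VAR": 1},
    {"YEAR": 2018, "MONTH": 11, "CPSIDP": 999, "SUPP_VAR": 2},
    {"YEAR": 2018, "MONTH": 4, "CPSIDP": 102, "SUPP_VAR": None},
]


def index_basic(rows, year, month):
    """Map CPSIDP -> record for one Basic Monthly sample."""
    return {
        r["CPSIDP"]: r
        for r in rows
        if r["YEAR"] == year and r["MONTH"] == month and r["CPSIDP"]
    }


def link(asec_rows, basic_index):
    """Pair each ASEC person with the matching Basic line, or None."""
    pairs = []
    for a in asec_rows:
        match = basic_index.get(a["CPSIDP"]) if a["CPSIDP"] else None
        pairs.append((a, match))
    return pairs


november = index_basic(basic, 2018, 11)
linked = link(asec, november)
matched = [(a["CPSIDP"], b["SUPP_VAR"]) for a, b in linked if b is not None]
print(matched)
```

Persons with no match in the Supplement month come back paired with None rather than being dropped, so attrition between the two months stays visible.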

I have read the matching documentation a couple of times, but I still have not fully wrapped my mind around it. (7) I have been assuming that CPSID is strictly analogous to CPSIDP, except for households rather than persons, and that, again, you can go straight from the ASEC to the Monthlies, which takes you to the Supplements on the same lines. Is this correct? But then I read this in the CPSID tab: "In some cases, a household will appear fewer than 8 times due to migration, mortality, non-response, and recording errors." I suppose that if you tear the house down you could call that household mortality, but I don’t know what would constitute migration. I thought that if people moved out and new people moved in, the new people were in the same household. Is that wrong?