People, samples and lines

First, I have been assuming that March Basic variables identified as “Core” on your variable selection page, if selected, appear on the same line as the ASEC variables. Is this true? OR do they appear on separate rows that need to be extracted and merged?

If a line comes from a month other than March, can I assume that it is a part neither of the ASEC nor of the ASEC-associated Basic Monthly sample?

If a line comes from a month other than March, is there an easy way to tell if it comes from the Basic Monthly sample or one of the Supplements, and if the latter, which one? I observe that there are 15 Supplements, so some months must have multiple Supplements.

If you select a Supplement sample, and you have also selected variables from the ASEC-associated March Basic Monthly Core, will you automatically get those same variables from the Basic Monthly associated with the month of the Supplement?

I have been using the same sample selection for all my extracts, on the assumption that this will make them line up and I can just glue the columns together. But now I think I need to pull out all the rows that don’t come with the ASEC and then merge them back in, so that each line represents a single person-year combination.

Assume that one row per person-year (where a “year” actually refers to four consecutive months containing a March) is my goal, so that all the information (in the samples and variables I have selected) about a person during one 4-month interval ends up on a single row. (I am not looking at the Monthly samples except as they are associated with a supplement or the ASEC). This suggests that it does not make sense to put all the samples I am using in each extract, since I just need to pull them out again in order to merge them, creating an extra step. Does this seem right?

I am confused about how the Hispanic oversample plays into all this. I know that the ASEC Basic Monthly (Core) variables associated with this sample come from the preceding November. I suppose it means that MONTH alone is not a reliable way to distinguish whether an individual line in an extract is in or out of the ASEC Core, but I’m not sure that is all it means.

Since you are trying to create one row per person per year, I will start by pointing you to the IPUMS CPS Linking page. It should answer some of your questions about the short panel survey design used by the CPS, and I will link to specific sections as I answer your questions in order:

  1. “Core” variables will appear on the same person record as ASEC variables for individuals in the ASEC sample. However, if your extract also includes the March Basic sample from the same year, all person records from the March Basic sample will also have these variables. This is unique to the ASEC supplement. Most IPUMS CPS supplements include the Basic Monthly in the same sample because they are composed of the same person records. This is not the case for ASEC supplements since they include oversamples. More info here.

  2. This is generally correct, except for the oversample in the ASEC, whose basic monthly respondents may be in an adjacent month. This should not affect estimates, however, and you can safely assign individuals to a supplement using MONTH.

  3. This question gets back to the fact that, aside from the ASEC, all monthly samples include both basic and supplement respondents. Therefore, in the IPUMS CPS sample selection menu, when you select the December 2017 Basic Monthly sample you will notice that the 2017 Food Security Supplement sample is also selected because these are actually the same sample. And you are correct that some months include multiple supplements, and there are also some supplements that have been retired (e.g. the Agricultural Worker supplement is not available past 1987).

The rest of your questions are best answered by the IPUMS CPS Linking page, but I will give a brief description of the way IPUMS CPS records relate to samples here to address your assumptions. An IPUMS CPS sample represents the respondents to a monthly questionnaire. So, even though a single respondent is represented in multiple months, each of their responses is a unique record in IPUMS CPS. To represent all responses from a single person in a given year on one row, you will need to follow the instructions on that linking page.
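To make the "one record per response" point concrete, here is a minimal Python sketch. It assumes the extract has already been read into a list of dicts keyed by IPUMS variable names (CPSIDP, YEAR, MONTH, ASECFLAG). The helper name group_by_person and the sample rows are made up for illustration, and treating a CPSIDP of 0 as "not linkable" is an assumption to check against the CPSIDP documentation.

```python
from collections import defaultdict

def group_by_person(records):
    """Collect every response for a person under that person's CPSIDP."""
    people = defaultdict(dict)
    for rec in records:
        pid = rec["CPSIDP"]
        if not pid:
            # Assumption: a CPSIDP of 0 marks a record that cannot be linked.
            continue
        # ASECFLAG is part of the key so that an ASEC record and a March Basic
        # record for the same person and month do not overwrite each other.
        key = (rec["YEAR"], rec["MONTH"], rec.get("ASECFLAG"))
        people[pid][key] = rec
    return dict(people)

# Made-up rows: one person seen in December, then twice in March
# (once in the ASEC, once in the March Basic), plus one unlinkable record.
records = [
    {"CPSIDP": 101, "YEAR": 2017, "MONTH": 12},
    {"CPSIDP": 101, "YEAR": 2018, "MONTH": 3, "ASECFLAG": 1},
    {"CPSIDP": 101, "YEAR": 2018, "MONTH": 3, "ASECFLAG": 2},
    {"CPSIDP": 0, "YEAR": 2018, "MONTH": 3, "ASECFLAG": 1},
]
people = group_by_person(records)
print(len(people[101]))  # number of separate records held for person 101
```

Each entry in the inner dict is still a separate response; collapsing them to a single row is a later step once you have decided which months belong together.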

I hope this helps.

Thanks Grover! That’s very helpful.

In case you and Jeff have been dividing these questions, this question makes more sense in the context of this one:

I have really put too many questions in this one post. Sorry. So I will number them to make them easier to keep track of.

What I am trying to do now is to assemble a sample that includes all and only the ASEC, both as to people and as to variables (though I would kind of like some reassurance that there is little or nothing in the March Basic that is not in the ASEC Core). Later, I will want to link the Supplements to the ASEC. I am not currently planning on linking any Basic Monthly samples except those associated with the Supplements I have selected.

My actual samples included all of the ASEC, the default samples (all the Monthlies in 2018) and three or four Supplements. Rather than redefine the samples and download them all again, my theory was that if I take all and only the lines with ASECFLAG true, I would end up with counts that match those on the IPUMS CPS sample sizes page.

However I now see that ASECFLAG only exists back to 1976. Also, I only see Basic Monthly samples back to that year. So I am supposing that there is no distinct March Basic Monthly prior to that year, so using the ASECFLAG should get me the right answer. (1) Does this seem correct?

Then, for Supplement linking, I think I discard any record that has a year/month combination different from any of those in which those supplements occur. (2) Assuming that one of the Supplements occurs in a month that includes the Hispanic oversample, should I or should I not discard records marked by ASECFLAG? (3) Will there be records in these months with ASECFLAG marked, or are these persons transferred out of these Basic Monthly files when added to the ASEC?

(4) Would I be correct in concluding that (at least under your rectangular format) Supplements and the Basic Monthly variables for the same month occur in a single line (one line per person)? (5) And the Core variables also occur on the same line as the ASEC variables, correct? (still 5) So in my case it is only the 2018 March Basic that will duplicate core variables (because I included your default samples, which included only the 2018 March Basic), correct?

Finally, I understand that conceptually linking occurs between the ASEC and the March Basic (and the ASEC and the Basic for the Hispanic oversample), and then from the March Basic (and oversample?) to the other Basics. (6) But so long as I have your CPSIDP, I do not need to follow this path, correct? I can go directly from the ASEC to the matching person in the Basic for the Supplement month, which would put me on the same line as that person’s Supplement record, correct?

I have read the matching documentation a couple of times, but I still have not fully wrapped my mind around it. (7) I have been assuming that CPSID is strictly analogous to CPSIDP, except for households rather than persons, and again, you can go straight from the ASEC to the Monthlies, which takes you to the Supplements on the same lines. Is this correct? But then I read this in the CPSID tab: "In some cases, a household will appear fewer than 8 times due to migration, mortality, non-response, and recording errors." I suppose that if you tear the house down you could call that household mortality, but I don’t know what would constitute migration. I thought that if people moved out and new people moved in, the new people were in the same household. Is that wrong?

There are actually a number of variables that are part of the March Basic file that the BLS does not include in the corresponding ASEC Supplement file. You can identify these variables by selecting a March Basic and an ASEC sample from a single year and exploring the “Core” variable groups. In particular, the “Core > Work” variable group has many multi-job and “past week” related variables that are not part of the ASEC file, presumably because the ASEC file is meant to reflect annual data.

Q1.

Yes, selecting records based on ASECFLAG should result in a file with only ASEC Supplement data, which you can then group by year to return ASEC P-record counts.
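As a rough illustration of that check, here is a small Python sketch that keeps only the ASEC records and counts person records by year. It assumes the extract is a list of dicts keyed by IPUMS variable names and that ASECFLAG is coded 1 for ASEC records (please confirm the code in the ASECFLAG codes tab). The function name and the sample rows are hypothetical.

```python
from collections import Counter

# Assumption: ASECFLAG == 1 marks ASEC records; verify in the codes tab.
ASEC_CODE = 1

def asec_person_counts(records, asec_code=ASEC_CODE):
    """Count ASEC person records per YEAR, ignoring everything else."""
    return Counter(
        rec["YEAR"] for rec in records if rec.get("ASECFLAG") == asec_code
    )

# Made-up rows: two ASEC records, one March Basic record, one December record.
sample = [
    {"YEAR": 2018, "MONTH": 3, "ASECFLAG": 1},
    {"YEAR": 2018, "MONTH": 3, "ASECFLAG": 1},
    {"YEAR": 2018, "MONTH": 3, "ASECFLAG": 2},
    {"YEAR": 2017, "MONTH": 12},
]
counts = asec_person_counts(sample)
print(counts[2018])  # ASEC person records found for 2018
```

The per-year counts can then be compared against the figures on the IPUMS CPS sample sizes page.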

Q2.

Records in non-ASEC files will not have an ASECFLAG value. The ASECFLAG variable is specifically for distinguishing ASEC records from March Basic records if both are included in a file. You would instead want to drop records in the ASEC file that are flagged as being a part of the oversample using the ASECOVERP flag, as it is not possible to link these individuals across files with CPSID/P.

Q3.

Individuals that are in the ASEC Oversample are not easily identifiable in their Basic Monthly counterparts, so again I would recommend removing them from the ASEC files with the ASECOVERP flag.

Q4.

Yes, except in the ASEC/March Basic case.

Q5.

Correct on all counts.

Q6.

Save for the Oversamples (as mentioned above), this is true. As long as you drop the ASECOVERP cases, you should be able to link directly using CPSIDP. The resulting links are based on the record identifiers provided by BLS in the public use files and do include some incorrect links, so we also recommend that you verify links using SEX, RACE, and AGE (with a reasonable tolerance), as respondents are unlikely to change these values from month to month.
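The steps above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the records are dicts keyed by IPUMS variable names, ASECOVERP is coded 1 for oversample cases, the supplement records all come from a single month (so each CPSIDP appears at most once), and an age tolerance of 2 years is an arbitrary choice, not an IPUMS recommendation. The function name and sample rows are hypothetical.

```python
def link_by_cpsidp(asec_records, supp_records, age_tolerance=2):
    """Pair ASEC records with supplement-month records that share a CPSIDP."""
    # Index the supplement month by CPSIDP, skipping empty identifiers.
    supp_by_id = {rec["CPSIDP"]: rec for rec in supp_records if rec["CPSIDP"]}
    links = []
    for asec in asec_records:
        # Assumption: ASECOVERP == 1 flags oversample cases, which are dropped.
        if asec.get("ASECOVERP") == 1:
            continue
        supp = supp_by_id.get(asec["CPSIDP"])
        if supp is None:
            continue
        # Validate the link: SEX and RACE must agree, AGE may drift slightly.
        plausible = (
            asec["SEX"] == supp["SEX"]
            and asec["RACE"] == supp["RACE"]
            and abs(asec["AGE"] - supp["AGE"]) <= age_tolerance
        )
        if plausible:
            links.append((asec, supp))
    return links

# Made-up rows: person 101 links cleanly, person 102 fails the SEX check,
# and the third ASEC record is an oversample case.
asec_rows = [
    {"CPSIDP": 101, "ASECOVERP": 0, "SEX": 1, "RACE": 100, "AGE": 40},
    {"CPSIDP": 102, "ASECOVERP": 0, "SEX": 2, "RACE": 200, "AGE": 30},
    {"CPSIDP": 0, "ASECOVERP": 1, "SEX": 1, "RACE": 100, "AGE": 55},
]
supp_rows = [
    {"CPSIDP": 101, "SEX": 1, "RACE": 100, "AGE": 41},
    {"CPSIDP": 102, "SEX": 1, "RACE": 200, "AGE": 30},
]
links = link_by_cpsidp(asec_rows, supp_rows)
print(len(links))  # pairs that survive the validation
```

Rejected pairs are silently dropped here; in practice you may prefer to keep them in a separate list so you can inspect how many links fail validation.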

Q7.

Your initial assumption is correct: households should receive the same identifier regardless of the individuals within them. However, some households do get omitted from files due to non-response, recording errors, etc. I believe the parts about mortality and migration were included erroneously. I will let the IPUMS CPS team know so that they can reword that section.

I hope this helps!