For the adjacent years for which IPUMS has not been able to construct a CPSID or CPSIDP to do the individual or household match, is this simply because the Census has not provided its version of household or individual identifiers for those years?
Or, to put the question differently, would (slightly fewer than) 50 percent of households still match across these years, if only they could be identified in some other way? Or are overlapping identifiers from the Census absent in these years because the CPS rotation pattern underlying the ASEC (4 months in, 8 months out, 4 months in) was itself disrupted?
The specific challenge to linking varies slightly from one set of samples to another. For example, from 1962 to 1978, person-level identifiers do not reliably identify the same individual in multiple samples. For the year pairs 1984-1985, 1985-1986, 1994-1995, and 1995-1996, changes in the numbering schemes for housing units prevent linking between these samples. Additional detail about these challenges can be found on page 126 of the JESM article by Drew et al. (2014). Finally, last summer we hosted a workshop all about using the CPS as a panel. Most of the presentations and laboratory exercises are available here and might be useful.
Thanks Jeff!
I’ve been thinking about the fact that households are really the people who live in a specific physical structure, even if there is complete turnover of human beings. I have been contemplating how far you can get by looking at the smallest available unit of geography common to two years, and then matching on all the characteristics that _aren’t_ related to the people who live there but are likely to remain constant for the building, like month in sample plus or minus four, GQ, MSAPMSZ, FARM, OWNERSHP, HOUSRET, PROPTAX, PUBHOUS, UNITSSTR, FUELHEAT, and PHONE, or whatever subset of those is shared over the years in question. It seems like that should get you down to groups that are small enough that matching on personal characteristics could be more reliable.
I’m reinventing the wheel here, aren’t I?
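To make the idea concrete, here is a minimal sketch of the two-stage match described above, in plain Python. It is only an illustration under stated assumptions, not how IPUMS builds CPSID: `GEO` stands in for whatever smallest shared geography is available, `HHID` is a hypothetical within-sample household identifier, the blocking variables are an arbitrary subset of the list above, and the person-level rule (same SEX and RACE, AGE advancing by 0 to 2 years) is one plausible consistency check rather than a documented standard.

```python
from collections import defaultdict

# Assumed subset of housing-unit variables shared by both years.
# "GEO" is a hypothetical placeholder for the smallest common geography.
BLOCK_KEYS = ("GEO", "GQ", "FARM", "OWNERSHP", "UNITSSTR", "PHONE")


def block_key(hh):
    """Tuple of housing-unit characteristics expected to stay constant."""
    return tuple(hh[k] for k in BLOCK_KEYS)


def candidate_pairs(year1, year2):
    """Pair year-1 households with year-2 households that share a block
    and whose month in sample is exactly four higher one year later."""
    blocks = defaultdict(list)
    for hh in year2:
        # Index year-2 households by the MISH they would have had a year ago.
        blocks[(hh["MISH"] - 4,) + block_key(hh)].append(hh)
    pairs = []
    for hh in year1:
        for cand in blocks.get((hh["MISH"],) + block_key(hh), []):
            pairs.append((hh["HHID"], cand["HHID"]))
    return pairs


def persons_consistent(p1, p2):
    """Assumed rule of thumb: same sex and race, age up by 0 to 2 years."""
    return (
        p1["SEX"] == p2["SEX"]
        and p1["RACE"] == p2["RACE"]
        and 0 <= p2["AGE"] - p1["AGE"] <= 2
    )


# Made-up records, for illustration only.
base = {"GEO": 12, "GQ": 1, "FARM": 1, "OWNERSHP": 10, "UNITSSTR": 3, "PHONE": 2}
year1 = [dict(base, HHID="a1", MISH=1), dict(base, HHID="a2", MISH=3, FARM=2)]
year2 = [dict(base, HHID="b1", MISH=5), dict(base, HHID="b2", MISH=6)]

print(candidate_pairs(year1, year2))
```

In practice a block will often contain several candidate households, so the person-level check would be used to rank or prune the pairs rather than to confirm a single one.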
I am not sure I fully understand the objective here. From the sound of it, this procedure would give you pretty much the same thing that CPSIDP gives, provided that invalid links are cleaned out of the data (see this presentation for details on link validation). I could be totally misunderstanding your question here, so feel free to clarify if necessary.
I intend to construct a synthetic panel by statistically matching people in MISH 5-8 to the most nearly similar people in MISH 1-4 in the same year. However, this only works if I can match the 5-8s to themselves in the previous year, and the 1-4s to themselves in the following year. I am trying to develop a strategy to prevent unmatchable years from breaking the chain.
I read the documentation reporting unmatchable samples as constraints on the periods that IPUMS could construct CPSID & CPSIDP matches. If that is not right and IPUMS has already built links over those years, I’d be very happy to use yours. I’m sure y’all would do a better job than I could.
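In case it helps to see what I mean by the within-year statistical match, here is a rough sketch in plain Python. Everything about it is an assumption for illustration: `PID` is a hypothetical person identifier, the match requires equal SEX and otherwise picks the donor with the smallest weighted absolute difference in AGE and INCTOT, and the weights are arbitrary. A real application would need a more careful choice of covariates and distance.

```python
def nearest_donor(recipient, donors, weights):
    """Return the same-sex donor with the smallest weighted distance,
    or None if no donor of that sex exists."""
    best, best_dist = None, float("inf")
    for d in donors:
        if d["SEX"] != recipient["SEX"]:
            continue  # exact match on sex (an assumed constraint)
        dist = sum(w * abs(d[k] - recipient[k]) for k, w in weights.items())
        if dist < best_dist:
            best, best_dist = d, dist
    return best


def match_within_year(people, weights):
    """Match each person in MISH 5-8 to the nearest person in MISH 1-4
    from the same year's sample."""
    donors = [p for p in people if 1 <= p["MISH"] <= 4]
    recipients = [p for p in people if 5 <= p["MISH"] <= 8]
    links = {}
    for r in recipients:
        d = nearest_donor(r, donors, weights)
        links[r["PID"]] = d["PID"] if d is not None else None
    return links


# Made-up records, for illustration only.
people = [
    {"PID": "r1", "MISH": 5, "SEX": 1, "AGE": 40, "INCTOT": 50000},
    {"PID": "d1", "MISH": 2, "SEX": 1, "AGE": 41, "INCTOT": 52000},
    {"PID": "d2", "MISH": 3, "SEX": 1, "AGE": 25, "INCTOT": 50000},
    {"PID": "d3", "MISH": 1, "SEX": 2, "AGE": 40, "INCTOT": 50000},
]
weights = {"AGE": 1.0, "INCTOT": 0.001}  # arbitrary illustrative weights

print(match_within_year(people, weights))
```

Each within-year link like this would then be chained to the true year-to-year links (5-8s back to the previous year, 1-4s forward to the next), which is why the unlinkable year pairs matter.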