Ages of youngest and oldest children greater than respondent's age

I am working with ACS one-year samples from 2019 to 2024. I noticed that for some respondents the ages of the youngest own child and the eldest own child are greater than the respondent's age. I am looking for guidance on determining whether these age differences are correct (perhaps because these are step or adopted children of an older spouse) or whether they could be the product of linking errors or other data quality issues. To explore this, I downloaded a revised extract with characteristics of the spouse: age_sp, yngch_sp, eldch_sp, qage_sp, and sprule.

I’m including a screenshot of some of these records. Based on my review, for most of these cases the age differences could arise because these are step or adopted children of an older spouse. Most of the links between the respondent and the spouse are clarity level 1, which I understand to mean that the match was obvious because there was no other potential spouse. So for many of them, problems with the link may not be a significant concern.

But I am looking for suggestions or guidance on how to evaluate cases like these for any linking or data quality concerns.

IPUMS creates the family interrelationship variables NCHILD, ELDCH, YNGCH, MOMLOC, POPLOC, SPLOC, and others based on algorithms that determine the most likely set of family interrelationships within a household. These variables are not self-reported, meaning respondents don’t answer who in the household their mother and father are, how many children they have in the household, and so on. We use RELATE (relationship to household head, which is directly reported by respondents in the ACS), marital status, age, sex, and order in the household roster to make these assignments. The algorithms used to create these assignments do identify non-biological relationships, such as those between step-parents and step-children or between adoptive parents and adoptive children. This means that in some cases, the age differences between people identified in these variables as parents and children may be very small, or may even be the reverse of what you would expect (i.e., the child is older than the parent).

As you have pointed out, the rules for assignments of mothers, fathers, and spouses are reported in the variables MOMRULE, POPRULE, and SPRULE, respectively. This white paper by Gorsuch and Williams on the construction of the family interrelationship variables provides more detailed information about how the links are assigned.

We are not aware of any significant data quality issues that need to be acted on. The family interrelationship variables constructed by IPUMS correctly identify the spouse in 99.99% of cases, and correctly identify the parent in 99% of cases. From parent identification, NCHILD, YNGCH, and ELDCH are constructed, meaning these variables have a high level of accuracy as well. Many family interrelationship variables are based on information in RELATE, which is self-reported, so these relationships should be accurately identified (e.g., if RELATE=2, the respondent is the spouse of the head of household, and SPLOC will point to the head of household). If you are only interested in biological relationships, there is unfortunately no way to directly identify these in ACS data. But you could certainly exclude cases where the age difference between the parent(s) and child(ren) seems improbable.
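As a rough illustration of that last suggestion, here is a minimal Python sketch for sorting records by whether the eldest own child is older than the respondent and, if so, whether the linked spouse is old enough to plausibly be the parent. Several things here are assumptions rather than IPUMS-provided logic: the `classify` function, the `min_gap` threshold, and the label names are hypothetical; the code assumes ELDCH uses 99 for "no own children" and SPLOC uses 0 for "no spouse in household" (please confirm both against the codebook for your extract); and it assumes variable names are uppercase, so adjust them if your extract uses lowercase names such as age_sp.

```python
from collections import Counter

NO_CHILD = 99  # assumed N/A code for ELDCH/YNGCH; confirm in the codebook


def classify(rec, min_gap=0):
    """Label one person record (a dict of IPUMS variables).

    min_gap is a hypothetical threshold: the smallest parent-child age
    gap you are willing to accept. With min_gap=0, only cases where the
    eldest own child is older than the respondent are flagged.
    """
    eldch = rec["ELDCH"]
    if eldch == NO_CHILD or rec["AGE"] - eldch >= min_gap:
        return "ok"
    # The gap to the eldest own child is smaller than min_gap.
    age_sp = rec.get("AGE_SP")
    if rec["SPLOC"] == 0 or age_sp is None:
        return "flag_no_spouse"
    if age_sp - eldch >= min_gap:
        # The spouse is old enough, consistent with a step or adoptive link.
        return "flag_spouse_plausible"
    return "flag_both_implausible"


# Made-up records for illustration only (not real ACS data).
records = [
    {"AGE": 30, "ELDCH": 99, "SPLOC": 0, "AGE_SP": None, "SPRULE": 0},
    {"AGE": 28, "ELDCH": 35, "SPLOC": 2, "AGE_SP": 60, "SPRULE": 1},
    {"AGE": 28, "ELDCH": 35, "SPLOC": 2, "AGE_SP": 33, "SPRULE": 1},
    {"AGE": 40, "ELDCH": 10, "SPLOC": 2, "AGE_SP": 41, "SPRULE": 1},
    {"AGE": 25, "ELDCH": 30, "SPLOC": 0, "AGE_SP": None, "SPRULE": 0},
]

labels = [classify(r) for r in records]

# Cross-tabulate the label against SPRULE to see which linking rules
# the flagged cases come from.
by_rule = Counter((lab, r["SPRULE"]) for lab, r in zip(labels, records))
```

Cases that land in a group like `flag_both_implausible` or `flag_no_spouse` would be the ones to inspect by hand or exclude. Raising `min_gap` (for example, to 12 or 15) widens the filter to include very small parent-child age gaps as well.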