Hello IPUMS Office!
I want to identify Mexican-American households in the 2010-2011 ACS in AZ, CA, CO, NM, and TX that contain both adult children (aged 18+) and their aging parents (aged 79+).
Specific questions:
- Should I be using MOMLOC/POPLOC in addition to RELATE to verify parent-child relationships?
- Is this the correct approach to ensure both an adult child and an aging parent are present in the same household?
- Do I also need to add group_by(SERIAL) to ensure they are in the same HH?
- Is there a benefit to using one approach over the other?
Thank you for any guidance!
VERSION #1:
1.) Filter for Mexican-American households (extract already restricted by years & states)
mexican_hhs <- acs %>%
filter(HISPAN == 1) # Mexican ethnicity
# n = 405,979 (unweighted)
mexican_hhs_n <- sum(mexican_hhs$PERWT) # n = 47,021,234 (weighted)
2.) Identify households with adult children and aging parents
acs_ma <- mexican_hhs %>%
  filter(GQ %in% c(1, 2)) %>% # Exclude group quarters
  group_by(SERIAL) %>% # Group by household
  filter(
    any(RELATE == 3 & AGE >= 18), # At least one adult child
    any(RELATE == 5 & AGE >= 79) # At least one aging parent
  ) %>%
  ungroup()
# Unweighted n
# n = 2,408 (unweighted)
# Weighted n
acs_ma_n <- sum(acs_ma$PERWT) # 266,826
VERSION #2:
acs_ma2 <- mexican_hhs %>%
  filter(GQ %in% c(1, 2)) %>% # Exclude group quarters
  group_by(SERIAL) %>%
  filter(
    any(RELATE == 3 & AGE >= 18 & (MOMLOC > 0 | POPLOC > 0)) & # At least one adult child
      any(RELATE == 5 & AGE >= 79) # At least one aging parent
  ) %>%
  ungroup()
# Unweighted n
# n = 2,408 (unweighted)
# Weighted n
acs_ma2_n <- sum(acs_ma2$PERWT) # 266,826
Thanks in advance for any insight!
One follow-up question I have is that with both versions of the df, when I restrict the sample to adult children only (RELATE == 3 & AGE >= 18) and collapse MARST into a dummy (1|2=married; all else=no), only 6.2% of the sample is married, which seems very low. However, this is not improbable given that this is a sample of adult children living together with their aging parents. Also, the age distribution of these adult children skews young (mean = 25).
## Gen married dummy
acs_ma <- acs_ma %>%
mutate(
married_dummy = case_when(
MARST %in% c(1, 2) ~ 1, # Married (spouse present or absent)
MARST %in% c(3, 4, 5, 6) ~ 0, # Not married (separated, divorced, widowed, never married)
TRUE ~ NA_real_ # Handle missing values
)
)
## Check the Distribution of MARST
adult_children_acs %>%
as_survey_design(weights = PERWT) %>%
group_by(MARST) %>%
summarise(
count = survey_total(),
percent = survey_mean()
)
# A tibble: 6 × 5
MARST count count_se percent percent_se
<int+lbl> <dbl> <dbl> <dbl> <dbl>
1 1 [Married, spouse present] 2780 911. 0.0392 0.0126
2 2 [Married, spouse absent] 1623 440. 0.0229 0.00623
3 3 [Separated] 1978 589. 0.0279 0.00826
4 4 [Divorced] 3453 760. 0.0487 0.0106
5 5 [Widowed] 196 114. 0.00276 0.00162
6 6 [Never married/single] 60946 2026. 0.859 0.0183
Hi Anna,
Thanks for reaching out; it’s good to hear from you! I hope that I can clarify the data better to help you decide how to best proceed with this project.
RELATE reports the relationship of each person in the household to the householder (identified with PERNUM = 1). The householder can be any person in whose name the housing unit is owned or rented. Unlike RELATE, which comes directly from the public use microdata file, MOMLOC and POPLOC are created by IPUMS to aid researchers in analyzing parental relationships. There are two main differences in the links provided by MOMLOC/POPLOC when compared with RELATE:
- They include (probabilistic) parental relationships where neither person is the householder.
- They will report the unmarried spouse of a child’s parent as their parent (in addition to including step- and adoptive parents).
I highly recommend reviewing the detailed documentation pages for MOMLOC/POPLOC to determine if you would like to leverage this additional information. Note that these parental links are used for our attach characteristics tool (available on the last screen before you submit your extract). Using this tool, you can use the variables in your data cart to add data about a person’s mother, father, household head, or spouse as an additional column in your data extract. For example, you can append mother’s and father’s age to the person’s record for analysis.
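To make the idea of a parental link concrete, here is a minimal sketch in R of looking up a coresident mother's and father's age by hand, assuming MOMLOC/POPLOC hold the PERNUM of the parent within the same household (0 when no parent is linked). The toy data and the column names mother_age and father_age are hypothetical; the attach characteristics tool produces its own variable names, and this is not a description of how it is implemented.

```r
library(dplyr)

# Toy household (hypothetical values): an 82-year-old householder,
# her 50-year-old child, and that child's 25-year-old child.
hh_toy <- tibble(
  SAMPLE = 201101,
  SERIAL = 1,
  PERNUM = 1:3,
  AGE    = c(82, 50, 25),
  MOMLOC = c(0, 1, 2),   # PERNUM of the linked mother; 0 = none in household
  POPLOC = c(0, 0, 0)    # PERNUM of the linked father; 0 = none in household
)

linked <- hh_toy %>%
  group_by(SAMPLE, SERIAL) %>%   # look up parents within the same household
  mutate(
    # match() returns NA when MOMLOC/POPLOC is 0, so the age is NA as well
    mother_age = AGE[match(MOMLOC, PERNUM)],
    father_age = AGE[match(POPLOC, PERNUM)]
  ) %>%
  ungroup()
```

In this toy household, person 3 is linked to person 2 as the mother even though neither of them is the householder, which is the kind of link that RELATE alone cannot identify.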
Your restriction to GQ = 1 and 2 misses a few households with GQ = 5. These are households containing 10 or more individuals unrelated to the household head. Note that SERIAL is only unique within each IPUMS sample; serial numbers will repeat between your 2010 and 2011 samples. Instead, you should group using a combination of SAMPLE and SERIAL.
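As a rough illustration of the grouping point, the sketch below uses toy data (hypothetical values) in which the same SERIAL appears in two samples. Grouping on SAMPLE and SERIAL together keeps the two households separate, while grouping on SERIAL alone would merge them. The object names acs_toy and hh_sizes are made up for this example.

```r
library(dplyr)

# Toy data (hypothetical values): SERIAL 1 occurs in both samples
acs_toy <- tibble(
  SAMPLE = c(201001, 201001, 201101, 201101),
  SERIAL = c(1, 1, 1, 1),
  PERNUM = c(1, 2, 1, 2),
  GQ     = c(1, 1, 5, 5)
)

hh_sizes <- acs_toy %>%
  filter(GQ %in% c(1, 2, 5)) %>%   # keep households, including GQ = 5
  group_by(SAMPLE, SERIAL) %>%     # one group per household per sample
  summarise(n_persons = n(), .groups = "drop")
```

Here hh_sizes has two rows of two persons each, rather than a single four-person "household".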
I’m finding that 10.6% of people aged 18+ with Hispanic Mexican origin living with their mother (MOMLOC > 0) in 2011 were married, while the corresponding figure for those living with their father is 9.6%. I also found a mean age of 25.03. This seems consistent with your estimates.
Hi Ivan,
It’s good to hear from you too & thanks for getting back to me! I wasn’t aware that MOMLOC/POPLOC identifies parental relationships where neither person is the householder. If possible, it would be great to identify shared households where either the older parent or the adult child is the householder – is this possible, i.e., to link the parents to their children in the HH, and additionally filter for either of them being the householder? Since I posted my original question, I’ve been playing around with my code, and I wonder what your thoughts are on this updated attempt to filter for Mexican-origin Hispanic adult children (18+) living with their parents (80+)? No rush and thanks for any insight!
## 1.) Generate unique HH identifier
# Extract already restricted to 2010-11 & AZ, CA, CO, NM, TX
# SERIAL is only unique within each year
# A combination of SAMPLE and SERIAL provides a unique identifier for every HH
acs$hh_id <- paste(acs$SAMPLE, acs$SERIAL, sep = "_")
## 2.) Filter for Mexican-origin HHs ---------------------------
mexican_hhs <- acs %>%
filter(HISPAN == 1) # Mexican ethnicity
# n = 405,979 (unweighted)
mexican_hhs_n <- sum(mexican_hhs$PERWT) # n = 47,021,234 (weighted)
## 3.) Create subset of HHs with adult children and their older parents ---------------------------
acs_mex2 <- mexican_hhs %>%
group_by(hh_id) %>%
filter(GQ %in% 1:2 | GQ == 5) %>% # Exclude group quarters but allow HHs with 10+ unrelated individuals
mutate(
mother_age = if_else(MOMLOC > 0, AGE[match(MOMLOC, PERNUM)], NA_real_), # If MOMLOC > 0 (mother lives in same HH), then find her age; match(MOMLOC, PERNUM) finds the row where PERNUM equals MOMLOC; AGE[match(...)] gets the age of that person (the mother)
father_age = if_else(POPLOC > 0, AGE[match(POPLOC, PERNUM)], NA_real_) # Same for dads
) %>%
filter(
any(AGE >= 18 & (mother_age >= 80 | father_age >= 80)) & # Ensure an adult child (18+) lives with a parent (80+)
any(RELATE == 1 & (AGE >= 18 | mother_age >= 80 | father_age >= 80))) %>% # Ensure either adult child OR parent is the householder
ungroup()
# Unweighted n
# n = 8,440
# Weighted n
acs_mex_n <- sum(acs_mex2$PERWT) # n = 877,763
I can comment about specifics that you should be aware of for your analysis, but confirming whether your code is correct is beyond the scope of our user support team. Having scanned through your code, I can share two additional thoughts that come to mind:
- IPUMS USA provides a tool that attaches characteristics of a person’s coresident parent(s) or spouse as a new variable on that person’s record (e.g., mother’s education, spouse’s employment status); these variables use the IPUMS-generated family interrelationships reported in MOMLOC/POPLOC and SPLOC. This tool is available in the Extract Options window right before you submit your extract for processing.
- PERWT is constructed to sum to the total US resident population in each year. Therefore, your weighted n estimates that use two years of ACS data are about double the size of your population of interest. While this does not pose any issues for regression analysis or constructing ratios, estimates of population-level aggregates (e.g., the total number of Mexican-origin Hispanic adult children living with their parents in the US) should be divided by two.
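The weight adjustment in the second point can be sketched as follows, assuming the extract includes YEAR and that each pooled year should count equally. The toy values and the names pooled_toy, n_years, pooled_total, and average_total are hypothetical.

```r
library(dplyr)

# Toy pooled file (hypothetical weights): two persons per year
pooled_toy <- tibble(
  YEAR  = c(2010, 2010, 2011, 2011),
  PERWT = c(100, 150, 120, 130)
)

n_years <- n_distinct(pooled_toy$YEAR)   # number of pooled years

# Summing PERWT across both years counts the population once per year
pooled_total <- sum(pooled_toy$PERWT)

# Dividing the weights by the number of years gives an average annual total
average_total <- sum(pooled_toy$PERWT / n_years)
```

Shares and regression coefficients are not affected by this rescaling; only population totals are.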
This is helpful. Thank you!