Families, secondary families, and subfamilies

How do I get an accurate count of families? I read a very helpful post from a few years ago about using PERNUM = 1 and the subfamily reference person to identify families and subfamilies. What about secondary families? They don’t appear to have anything analogous to PERNUM = 1 or the SFRELATE reference person. Many secondary families seem to be individuals sharing a household. Do I count them as families?
I believe I can create a distinct “family ID” from SERIAL_FAMUNIT_SUBFAM. Will a count of those distinct “family IDs” be the best count of families?
Thanks for your help. I really appreciate the fast download processing. Please keep up the great work.

How to define and count families is specific to your research application and the corresponding literature; these are analytical decisions that we leave up to each individual researcher. However, I can share some information about variables that you may want to leverage in defining families. A key consideration is whether a variable is generated using IPUMS-defined families or families as defined by the Census Bureau; the two definitions do not always align, and you may get confusing results if you mix variables from both. I am linking to a page on differences between families and sub-families in IPUMS USA that may be helpful.

Your approach of leveraging SERIAL, FAMUNIT, and SUBFAM makes sense to me (FAMUNIT and SUBFAM both use IPUMS-derived family definitions). Note that there is not a head or reference person identifier in the unrelated subfamilies, so you will need to simply choose one person per unrelated subfamily if you retain them in your counts. I would also encourage you to be intentional about how you handle individuals in group quarters (IPUMS USA variable GQ) and unrelated individuals in your counts.
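
For anyone who wants to try the SERIAL_FAMUNIT_SUBFAM approach, here is a minimal pandas sketch. It assumes a person-level extract with SERIAL, PERNUM, FAMUNIT, SUBFAM, and HHWT; the toy values, the `FAMILY_ID` and `UNIT_SIZE` column names, and the choice of the lowest PERNUM as the representative person are all illustrative assumptions, not an IPUMS-recommended method. Using HHWT for units other than the primary family is also an assumption you may want to revisit.

```python
import pandas as pd

# Toy person-level extract; every value here is made up for illustration.
people = pd.DataFrame({
    "SERIAL":  [1, 1, 1, 1, 2, 2, 3],
    "PERNUM":  [1, 2, 3, 4, 1, 2, 1],
    "FAMUNIT": [1, 1, 1, 1, 1, 2, 1],
    "SUBFAM":  [0, 0, 1, 1, 0, 0, 0],
    "HHWT":    [100, 100, 100, 100, 50, 50, 80],
})


def add_family_id(df):
    """Attach a SERIAL_FAMUNIT_SUBFAM identifier and the unit's person count."""
    out = df.copy()
    out["FAMILY_ID"] = (
        out["SERIAL"].astype(str) + "_"
        + out["FAMUNIT"].astype(str) + "_"
        + out["SUBFAM"].astype(str)
    )
    # Size of each unit, counted directly from the person records.
    out["UNIT_SIZE"] = out.groupby("FAMILY_ID")["PERNUM"].transform("size")
    return out


def one_row_per_unit(df):
    """Keep one (arbitrary) person per unit: the lowest PERNUM in that unit."""
    return (
        df.sort_values(["SERIAL", "PERNUM"])
          .drop_duplicates("FAMILY_ID", keep="first")
    )


units = one_row_per_unit(add_family_id(people))
n_units = units["FAMILY_ID"].nunique()          # all distinct units
multi_person = units[units["UNIT_SIZE"] > 1]    # units with 2+ people
weighted_multi = multi_person["HHWT"].sum()     # weighted count of those units
```

Whether one-person units, group quarters, and unrelated individuals belong in the count is the analytical decision described above; the sketch only makes each choice explicit as a filter.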

Please follow up with further questions.


Kari,

Thank you for your response.

I’m looking at ACS 2021 5-yr. estimates for Vermont, GQ 1 and 2. I have 30,211 records.

When I count households—PERNUM = 1 with HHWT—I get 262,516, which matches data.census.gov (262,514).

However, when I count families—PERNUM = 1 and FAMSIZE > 1 with HHWT—I get 173,138, which is quite a bit more than data.census.gov total families: 156,687. (And this would appear to be missing second, third, and fourth families within households because there are no cases where PERNUM = 1 and FAMUNIT > 1.)

The totals are off for every family size:



Vermont

| Label | data.census.gov | IPUMS |
| - | - | - |
| 2-person families: | 87,976 | 99,743 |
| 3-person families: | 32,943 | 34,994 |
| 4-person families: | 24,681 | 25,618 |
| 5-person families: | 7,529 | 8,402 |
| 6-person families: | 2,598 | 2,838 |
| 7-or-more-person families: | 960 | 1,543 |
| Total | 156,687 | 173,138 |

The Vermont Legislature is considering repeal of a newly passed state child tax credit, so I’m trying to get a count of families with children under six and family income up to $125,000. I would like to get an accurate count of families, although for the purposes of this analysis, I can use households.

I’m grateful for any suggestions, and I’d like to understand what I’m doing wrong on the family counts.

Thank you again for your help.

Jack


Thanks for the quick response. I am not certain which table you are using, but will note that in the ACS Subject Definitions for 2021 the Census Bureau specifies “A household can contain only one family for purposes of tabulations.” This explains why your estimates exceed those in the published tables (e.g., S1101). There are other tables restricted to sub-families (e.g., B11014) that may be of interest to you.

A quick note that when estimating family income, FTOTINC has some quirks in its calculation, so I recommend calculating this directly (e.g., summing INCTOT for all individuals with the same SERIAL and FAMUNIT value).
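
A rough pandas sketch of that calculation, tied to the stated goal of finding families with a child under six and family income up to $125,000. It assumes a person-level extract with SERIAL, FAMUNIT, AGE, and INCTOT, and it assumes 9999999 is the INCTOT not-applicable code (please confirm against the codebook for your samples). The toy data, the `FAM_INCOME` and `HAS_CHILD_U6` names, and the age-only definition of a child are illustrative choices. If your extract pools several samples, you would likely need SAMPLE in the grouping key as well.

```python
import pandas as pd

INCTOT_NA = 9999999  # assumed N/A code for INCTOT; verify in the codebook

# Toy person-level records; values are invented for illustration.
people = pd.DataFrame({
    "SERIAL":  [1, 1, 1, 2, 2],
    "FAMUNIT": [1, 1, 1, 1, 2],
    "AGE":     [34, 36, 4, 50, 29],
    "INCTOT":  [60000, 45000, 9999999, 130000, 20000],
})


def family_income(df):
    """Sum INCTOT within each SERIAL/FAMUNIT and flag families with a child under 6."""
    # Treat the N/A code as zero income so it does not inflate the sum.
    income = df["INCTOT"].where(df["INCTOT"] != INCTOT_NA, 0)
    return (
        df.assign(INC_CLEAN=income, CHILD_U6=df["AGE"] < 6)
          .groupby(["SERIAL", "FAMUNIT"], as_index=False)
          .agg(FAM_INCOME=("INC_CLEAN", "sum"),
               HAS_CHILD_U6=("CHILD_U6", "any"))
    )


fam = family_income(people)
eligible = fam[fam["HAS_CHILD_U6"] & (fam["FAM_INCOME"] <= 125000)]
```

The result has one row per IPUMS family unit, so a weighted count would still need a weight merged back in from one person per unit.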

Kari,

I think I found the discrepancy.

I went to the PUMS data and got the same results as the published ACS tables. Table S1101 shows household count of 262,514 and families at 156,687. Using the PUMS data, I got 262,516 and 156,274.

The Census defines a family as: A family is a group of two people or more (one of whom is the householder) related by birth, marriage, or adoption and residing together; all such people (including related subfamily members) are considered as members of one family.

With the PUMS data, I filtered for SPORDER (PERNUM) = 1 and NPF (Number of persons in family) > 1, which gave me the same weighted total as Table S1101.

Using the IPUMS, when I filter for PERNUM = 1 and famsize > 1, I get 173,138 weighted families.

When I compared the PUMS dataset with the IPUMS dataset, I found records in the PUMS data where the number of persons in family (NPF) was NULL, but in the matching records in the IPUMS data, FAMSIZE was not NULL.

I didn’t dig into the relationship, but some cases appeared to be just two individuals living in the same household. It appears IPUMS counted them as a family of two, but PUMS didn’t count them as a family. Does IPUMS not use the NPF variable for FAMSIZE? If not, how do you derive family size? It would be good to know if IPUMS uses a different calculation for this variable.

I hope this is helpful.

Jack

The IPUMS USA variable FAMSIZE is based on IPUMS family definitions, not Census Bureau definitions (see the comparability tab of FAMSIZE). The 2021 PUMS data dictionary indicates that NPF only includes “family-households” (see page 82 of the 2021 ACS subject definitions PDF for the definition); I suspect that your “overestimate” of families arises because you are including households that are non-family households under the Census Bureau definition. I think you can leverage the variable CBHHTYPE to get at this.

When I look at the weighted count of cases where PERNUM = 1, FAMSIZE > 1, STATEFIP = 50, and CBHHTYPE is 1, 2, 6, 7, 10, or 11 (note this excludes cohabitors), I get 156,522.
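
In case it helps to see that filter spelled out, here is a small pandas sketch. The column names follow the IPUMS USA variables mentioned in this thread; the toy records and the function name are made up, so the totals below have nothing to do with the Vermont figures.

```python
import pandas as pd

# CBHHTYPE codes treated as family households in this thread
# (married-couple and no-spouse/partner households with relatives present).
FAMILY_TYPES = [1, 2, 6, 7, 10, 11]

# Toy person-level records; values are invented for illustration.
people = pd.DataFrame({
    "SERIAL":   [1, 1, 2, 2, 3, 4, 4],
    "PERNUM":   [1, 2, 1, 2, 1, 1, 2],
    "STATEFIP": [50, 50, 50, 50, 50, 50, 50],
    "FAMSIZE":  [2, 2, 2, 2, 1, 2, 2],
    "CBHHTYPE": [2, 2, 4, 4, 5, 7, 7],
    "HHWT":     [120, 120, 90, 90, 60, 75, 75],
})


def weighted_family_households(df, statefip=50):
    """Weighted count of householders in Census-style family households."""
    mask = (
        (df["PERNUM"] == 1)
        & (df["FAMSIZE"] > 1)
        & (df["STATEFIP"] == statefip)
        & df["CBHHTYPE"].isin(FAMILY_TYPES)
    )
    return df.loc[mask, "HHWT"].sum()


# FAMSIZE-only filter: includes the cohabiting-couple household (SERIAL 2).
ipums_style = people.loc[
    (people["PERNUM"] == 1) & (people["FAMSIZE"] > 1), "HHWT"
].sum()

# Adding the CBHHTYPE restriction drops that household.
census_style = weighted_family_households(people)
```

Comparing `ipums_style` with `census_style` shows how much of a gap comes from households that the CBHHTYPE restriction removes.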

Kari,

Thank you. The CBHHTYPE is helpful. I get different results. Here are my numbers using IPUMS data (PERNUM = 1, FAMSIZE > 1, CBHHTYPE 1,2,6,7,10, or 11.)

Census bureau household type (with cohabiting)

| Code | Label | Frequency | Percent | Valid Percent | Cumulative Percent |
| - | - | - | - | - | - |
| 1 | Married couple household with own children <18 | 39,672 | 15.1 | 15.1 | 15.1 |
| 2 | Married couple household, NO own children <18 | 83,201 | 31.7 | 31.7 | 46.8 |
| 3 | Cohabiting couple household with own children <18 | 5,303 | 2.0 | 2.0 | 48.8 |
| 4 | Cohabiting couple household, NO own children <18 | 18,673 | 7.1 | 7.1 | 55.9 |
| 5 | Female householder, no spouse/partner present, Living alone | 44,106 | 16.8 | 16.8 | 72.7 |
| 6 | Female householder, no spouse/partner present, with own children <18 | 9,373 | 3.6 | 3.6 | 76.3 |
| 7 | Female householder, no spouse/partner present, with relatives, NO own children <18 | 9,367 | 3.6 | 3.6 | 79.9 |
| 8 | Female householder, no spouse/partner present, only nonrelatives present | 4,238 | 1.6 | 1.6 | 81.5 |
| 9 | Male householder, no spouse/partner present, Living alone | 36,851 | 14.0 | 14.0 | 95.5 |
| 10 | Male householder, no spouse/partner present, with own children <18 | 3,071 | 1.2 | 1.2 | 96.7 |
| 11 | Male householder, no spouse/partner present, with relatives, NO own children <18 | 4,446 | 1.7 | 1.7 | 98.4 |
| 12 | Male householder, no spouse/partner present, only nonrelatives present | 4,215 | 1.6 | 1.6 | 100.0 |
| | Total | 262,516 | 100.0 | 100.0 | |

Types 1, 2, 6, 7, 10, and 11 combined: 149,130

I got the same results using the PUMS data (SPORDER = 1, NPF > 1, HHTYPE 1,2,6,7,10, or 11).

Both FAMSIZE and NPF appear to be a count of family members. But some records in the IPUMS data with a FAMSIZE of two are cohabiting couples. In the PUMS data, these records have a NULL value in the NPF variable. It might be helpful if IPUMS could explain somewhere in the FAMSIZE information that cohabiting couples are being counted as families.

Thanks for the 2021 ACS subject definitions link. It does explain the difference between families and family households. It says a family household may be larger than the family it contains because non-family members may also be present. However, it also says that the number of families is the same as the number of family households, so that shouldn’t have made a difference in my family count.

Whenever I work with the IPUMS data, I like to do a quick check of the dataset to see that I can match the published ACS data. Depending on what I’m doing, I check population, households, median income, etc. as a way to verify I have all of the records I need. It would be good if IPUMS identified the set of variables that will produce the same family count that ACS publishes for each data series.

Thanks again for your help with this.

Jack


Thanks for the update. The IPUMS USA team plans to review the family-related variables and revise our documentation to more clearly differentiate between those based on IPUMS-derived families and those based on Census Bureau definitions; I don’t have a definitive timeline for this work.

Kari,

I’ll keep an eye out. In the meantime, thanks for the great work you all do,

Jack
