Key identifier for household and personal data

Hi there -

I'm curious how I can merge household-level data with person-level data. Is there a key identifier that I can use to merge these two groups? I am interested in geographic & race information, but geographic information is available at the household level only, while race information is available at the person level only. Thank you.

The default format for data extracts from IPUMS USA is rectangularized at the person level, meaning any household-level variables you include in your extract will be appended to the relevant person records. You could also request data in a hierarchical format (where person records are nested within households) and append the household-level information yourself using syntax in a statistical package, but requesting a rectangularized file is much simpler. You can check the data structure of your extract by looking at the information on the “Structure” line of the extract description.

If you select “Change” to the right of the “Structure” line, you can select a different data structure.

Hi Kari,

Thank you for your response. I received the restricted 100% census data from IPUMS recently, so I cannot control the structure of the data. Do you know how to connect person-level information to household-level information in the restricted version of the data?

The unique household identifier (SERIAL) is available on both the household and person records. You can use this with the appropriate syntax for your statistical analysis software to rectangularize your data and append the household-level variables onto the person records. You may need to combine SERIAL with YEAR if you have multiple full-count censuses in your data file. If you have questions that are specific to the restricted use full-count data, I encourage you to follow up directly with that team by email.
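As a rough illustration of that merge, here is a minimal Python sketch. It assumes the household and person records have already been read into two separate lists of dictionaries; the restricted file's actual layout may differ, so treat this as a starting point rather than the exact procedure. SERIAL and YEAR are the identifiers described above, while the function name `attach_household_vars`, the STATEFIP, PERNUM, and RACE columns, and all of the values shown are chosen purely for illustration.

```python
def attach_household_vars(households, persons, keys=("YEAR", "SERIAL")):
    """Append household-level fields to each person record (many-to-one join)."""
    # Index household records by the composite key (YEAR, SERIAL).
    lookup = {}
    for hh in households:
        key = tuple(hh[k] for k in keys)
        if key in lookup:
            raise ValueError(f"duplicate household key: {key}")
        lookup[key] = hh

    merged = []
    for person in persons:
        key = tuple(person[k] for k in keys)
        hh = lookup.get(key)
        if hh is None:
            raise KeyError(f"no household record for person key: {key}")
        # Household fields go first, so person fields win on any name clash.
        merged.append({**hh, **person})
    return merged


# Made-up illustrative records, not real census data.
households = [
    {"YEAR": 1930, "SERIAL": 1, "STATEFIP": 27},
    {"YEAR": 1940, "SERIAL": 1, "STATEFIP": 55},
]
persons = [
    {"YEAR": 1930, "SERIAL": 1, "PERNUM": 1, "RACE": 1},
    {"YEAR": 1930, "SERIAL": 1, "PERNUM": 2, "RACE": 1},
    {"YEAR": 1940, "SERIAL": 1, "PERNUM": 1, "RACE": 2},
]

rectangular = attach_household_vars(households, persons)
for row in rectangular:
    print(row)
```

Note that SERIAL 1 appears in both years here, which is why the sketch joins on YEAR and SERIAL together. In a statistical package, the equivalent step would be a many-to-one merge of the person file onto the household file using the same two keys.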

Got it.

I have a general question about the data set. Is the enumeration district consistent over time, meaning that it’s similar to the census block that we used in the old days? Can I assume that enumeration district ‘1’ in 1930 is the same location (geographic group) as enumeration district ‘1’ in 1940?

Also, does the variable “Street” provide a full street address or just a street name?

Thank you for your responses!

Hi Jamie,

Enumeration districts are not consistent over time, so I wouldn’t assume that ED 1 in 1930 is the same as ED 1 in 1940. The EDs were established to facilitate enumeration of households, and that was highly dependent on the structure of the locale. They were established before each Census, and they did not (as far as I know) use prior definitions.

Dave Van Riper

As noted on the variable description tab for STREET, this variable reports the first 32 characters of the street address as written on the original census form.