RACESING for CITY in 1960 missing?

Hello! I’m trying to compare the share of people who identified as Black in different cities at 10-year intervals going back as far as I can. My plan is to use SDA: the CITY variable for rows; the RACESING and RACHSING variables for columns; and the YEAR variable for control.

  1. Is there a better way to answer my question? I’ve read the IPUMS comparability documentation on CITY and there’s a lot going on — I’m not entirely sure if I should exclude certain cities or if there are some that aren’t comparable across decennial counts. And will the results be missing any major cities?

  2. The availability tabs for CITY and RACESING indicate that those variables are present for 1960. However, when I run that query, I get no records. Any tips?


(3. Relatedly, any guidance for how to fill in the blanks for 1970? The documentation notes that the CITY variable isn’t available for that year and the query does indeed return no records.)

For research on aggregate city-level demographics, my recommendation is to use time series data from IPUMS NHGIS rather than microdata from IPUMS USA. Census microdata consistently identifies the city of residence for households only until 1960. In that and all following years, except for 1980, the city of residence is not provided. Instead, other variables need to be used to identify, and sometimes guess, the city of residence for respondent households. From 1990 onward, the data provide the Public Use Microdata Area (PUMA) in which a household resided. Sometimes a PUMA lies entirely within the city boundaries, but often it is only partly within a city. CITY is IPUMS’s attempt to offer researchers a workable solution by reporting the city in which the majority of each PUMA’s population resided. As a result, a household might not in fact have resided in the city it is assigned to. To determine whether your results are missing major cities, look at the availability tab for CITY, which shows the cities identified in each year.
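To make the majority rule concrete, here is a minimal Python sketch of how such an assignment could work and how much of a PUMA's population would end up attributed to a city it does not live in. This is only an illustration of the idea described above, not IPUMS's actual procedure; the function name `assign_city`, the input layout, and the population counts are all hypothetical.

```python
def assign_city(pop_by_city):
    """Assign a PUMA to the city holding a strict majority of its population.

    pop_by_city maps a city name (or None for residents outside any
    identified city) to a population count. Returns a tuple of
    (assigned city or None, share of the PUMA living outside that city).
    Hypothetical helper for illustration only.
    """
    total = sum(pop_by_city.values())
    if total == 0:
        return None, 0.0
    city, pop = max(pop_by_city.items(), key=lambda item: item[1])
    # No assignment if the largest group is "outside any city"
    # or if no single city holds more than half of the population.
    if city is None or 2 * pop <= total:
        return None, 0.0
    return city, 1 - pop / total


# Made-up PUMA: 70,000 residents inside "Example City", 30,000 outside it.
city, outside_share = assign_city({"Example City": 70_000, None: 30_000})
print(city, f"{outside_share:.0%} of residents live outside the assigned city")
```

In this toy case every household in the PUMA would be coded to the one city even though a sizable minority live elsewhere, which is why tabulations of CITY from PUMA-based samples are approximations.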

IPUMS NHGIS avoids these issues by reporting aggregate statistics for counties and places, and it provides time series tables on race going back to 1970 at the place level. You can find these by going to the Data Finder tool and selecting Place as your Geographic Level and Race as your Topic. The time series tables are located on the second tab. You can download tables using either nominal data (i.e., using contemporaneous place boundaries) or standardized data (i.e., using place boundaries standardized to a particular year). These can then be supplemented by microdata for earlier years, when identifying cities is much more practical. Note that while geographic detail is a benefit of aggregate data on NHGIS, tables are available for only a limited set of variables. RACESING is an example of an IPUMS variable for which NHGIS tables are unavailable. That means that when comparing racial identity over time, researchers should keep in mind that respondents in years prior to 2000 were not given the option to identify with more than one race.
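Once you have place-level counts by race, the share calculation itself is simple. The sketch below uses only the Python standard library and assumes a long-format table with one row per place, year, and race category. The column names (`place`, `year`, `race`, `persons`) and the counts are made up for illustration; an actual NHGIS extract uses its own column codes, so you would need to reshape or rename accordingly.

```python
import csv
import io
from collections import defaultdict

# Hypothetical extract: column names and counts are illustrative,
# not real NHGIS codes or real census figures.
SAMPLE = """place,year,race,persons
Example City,1970,Black,200
Example City,1970,White,750
Example City,1970,Other,50
Example City,1980,Black,300
Example City,1980,White,600
Example City,1980,Other,100
"""


def black_share_by_place_year(rows):
    """Return {(place, year): Black population / total population}."""
    totals = defaultdict(int)
    black = defaultdict(int)
    for row in rows:
        key = (row["place"], int(row["year"]))
        persons = int(row["persons"])
        totals[key] += persons
        if row["race"] == "Black":
            black[key] += persons
    # Skip place-years with no population to avoid dividing by zero.
    return {key: black[key] / totals[key] for key in totals if totals[key] > 0}


shares = black_share_by_place_year(csv.DictReader(io.StringIO(SAMPLE)))
for (place, year), share in sorted(shares.items()):
    print(f"{place} {year}: {share:.1%}")
```

If you use nominal rather than standardized tables, keep in mind that a change in a city's share between years can partly reflect a change in its boundaries.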

Regarding your question about 1960, CITY is only available in the 1960 5% sample. That sample can be analyzed separately from the United States, 1850-2021 1% samples by selecting 1960 Census 5% from the Online Data Analysis System page. In the 1970 samples, the smallest geographic identifier available (in the 1970 1% form 1 and form 2 metro samples) is METAREA. Note that due to confidentiality requirements, in 1970 cities were identified in METAREA only if they had 250,000 or more inhabitants.