Historical central (principal) city data

Yeonhwa_Lee · April 2, 2024, 8:42pm

Hello,

I would like to be able to identify tracts that belonged to central cities. For more recent years (e.g., 1999, 2009, and on), it appears that I can take the OMB’s list of central / principal cities, use the place FIPS codes to identify their polygons in the Places shapefile, and find the intersecting tracts.

Is there anything across NHGIS and IPUMS USA (METRO variable?) that might similarly enable me to attain central city boundaries and/or tracts that belong to central cities for earlier years (1940 - )?

Thank you very much.

MPC_vanriper · April 5, 2024, 2:12am

Dear Yeonhwa,

This is actually more complicated than it seems, and it’s going to require some research to track down each decade’s central cities. You’ll also have to do some custom programming to identify the census tracts in central cities.

The concept of the central city was first introduced in 1950, when standard metropolitan areas (SMAs) were first developed by the Census Bureau. I found a PDF listing central cities in 1960 and 1950, and I recommend looking in it for a list of central cities used for those two decades.

Then, once you’ve determined the central cities for those decades, you should download census tract CSVs from NHGIS. You can then use the values in the AREANAME field to identify the tracts that fall within central cities. For example, two census tracts in Alameda County, CA, in 1950 were

STCTY-06001 TRACT- 0072 IN OAKLAND CA
STCTY-06001 TRACT- AC 0001 IN ALBANY CA

The first record is in Oakland, which was a central city in 1950. The second record is also in Alameda county, but was in Albany, which was not a central city in 1950. You’ll have to do some string parsing to flag tracts in the central cities.
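The string parsing described above could be sketched roughly as follows. This is only a minimal illustration: it assumes every AREANAME value ends with "IN <CITY> <ST>", as the two records above do, and the `CENTRAL_CITIES_1950` set and both function names are hypothetical placeholders to be replaced with the central city list for the decade.

```python
import re

# Hypothetical lookup of (city, state) pairs; fill this in from the
# central city list for the decade you are working on.
CENTRAL_CITIES_1950 = {("OAKLAND", "CA")}

# Matches the trailing "IN <CITY> <ST>" part of an AREANAME value.
# Assumes the layout seen in the two example records above.
_AREANAME_PLACE = re.compile(r"\sIN\s+(.+)\s+([A-Z]{2})\s*$")


def parse_place(areaname):
    """Return (city, state) parsed from an AREANAME string, or None."""
    match = _AREANAME_PLACE.search(areaname.upper())
    if match is None:
        return None
    city, state = match.groups()
    return city.strip(), state


def in_central_city(areaname, central_cities=CENTRAL_CITIES_1950):
    """True if the tract's parsed place is in the central city set."""
    return parse_place(areaname) in central_cities


for name in (
    "STCTY-06001 TRACT- 0072 IN OAKLAND CA",
    "STCTY-06001 TRACT- AC 0001 IN ALBANY CA",
):
    print(name, "->", in_central_city(name))
```

Rows that do not follow this layout return None from `parse_place` and are treated as non-central, so it is worth counting those rows before trusting the flag.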

In 1960, you can adopt the same strategy. I will also point out that the 1960 tract file has a PLACE field containing codes for cities. I would bet that most tracts with a value in the PLACE field are likely to be inside central cities.

In 1970, you can get a list of central cities in NHGIS in the 1970_Cnt4Pa dataset. Central cities are a compound geographic level in 1970. You can also find the central cities for each standard metropolitan statistical area in this PDF. Unfortunately, the 1970 tract data do not have a city code on them, which would have made it easier to identify central city tracts. By 1970, census tracts do not always nest within cities, so there is not as clean a way to identify tracts within central cities.

In 1980, you will need a compound geographic level to identify tracts within central cities:

Census Tract/Block Numbering Area (by State--Standard Metropolitan Statistical Area--County--County Subdivision--Place) - tract_02098

This geographic level is available in the 1980_STF1 dataset. If you download this file, you will want to filter for census tracts with a value of PLACEDESC == 3. I believe these represent tracts within central cities of SMSAs.
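As a rough sketch of that filter, the snippet below keeps only rows whose PLACEDESC equals 3. It assumes the extract is a CSV with a PLACEDESC column stored as text; the function name and the sample rows (including their GISJOIN placeholders) are made up purely for illustration.

```python
import csv
import io


def central_city_tracts(csv_file, desc_column="PLACEDESC", central_code="3"):
    """Return the rows whose place description code marks a central city.

    csv.DictReader reads every field as a string, so codes are compared
    as text rather than as numbers.
    """
    reader = csv.DictReader(csv_file)
    return [
        row for row in reader
        if (row.get(desc_column) or "").strip() == central_code
    ]


# Made-up rows for illustration only; real extracts have many more columns.
sample = io.StringIO(
    "GISJOIN,TRACTA,PLACEDESC\n"
    '"ROW_A","0001","3"\n'
    '"ROW_B","0002","1"\n'
    '"ROW_C","000300","3"\n'
)

for row in central_city_tracts(sample):
    print(row["GISJOIN"], row["TRACTA"])
```

For a real extract, pass a file opened with `open(path, newline="")` in place of the `io.StringIO` sample.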

In 1990, you will need a compound geographic level to identify tracts within central cities:

Census Tract (by State--County--County Subdivision--Place/Remainder) - tract_080

This geographic level is available in the 1990_STF1 dataset. If you download this file, you will want to filter for census tracts with a value of PLACEDESC == 3. I believe these represent tracts within central cities of SMSAs. You can also get a list of central cities from the following geographic level:

Central City (by Metropolitan Statistical Area/Consolidated Metropolitan Statistical Area--State) - place_382

I recommend cross-checking the tracts with PLACEDESC == 3 against the list of central cities in the place_382 file.
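One way to sketch that cross-check is to list the tracts flagged with PLACEDESC == 3 whose place does not appear in the central city file. The example assumes both extracts carry STATEA and PLACEA columns that can serve as a join key, which should be verified against the actual files; the function name and the sample values are hypothetical.

```python
def unmatched_tracts(tract_rows, city_rows, keys=("STATEA", "PLACEA")):
    """Return flagged tracts whose place is missing from the city list.

    tract_rows and city_rows are lists of dicts (for example, rows from
    csv.DictReader). The key columns are an assumption about the extracts.
    """
    known = {tuple(row[k] for k in keys) for row in city_rows}
    return [
        row for row in tract_rows
        if row["PLACEDESC"].strip() == "3"
        and tuple(row[k] for k in keys) not in known
    ]


# Made-up values for illustration only.
cities = [{"STATEA": "06", "PLACEA": "1111"}]
tracts = [
    {"STATEA": "06", "PLACEA": "1111", "TRACTA": "0001", "PLACEDESC": "3"},
    {"STATEA": "06", "PLACEA": "2222", "TRACTA": "0002", "PLACEDESC": "3"},
    {"STATEA": "06", "PLACEA": "3333", "TRACTA": "0003", "PLACEDESC": "1"},
]

for row in unmatched_tracts(tracts, cities):
    print("flagged but not in central city list:", row["TRACTA"])
```

An empty result would suggest the PLACEDESC flag and the central city list agree; any rows returned deserve a closer look.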

I think the 1970 identification will be the most difficult since the geographic detail in the summary files is less than in later years.

Sincerely,
Dave Van Riper
IPUMS

Yeonhwa_Lee · May 16, 2024, 1:32pm

Dear Dave,

Thank you very much for your thorough answer. I have been implementing your solutions and have successfully identified central city tracts for 1960.

While working on 1980, I ran into an issue with tract IDs. The GISJOIN column for the compound geographic level (tract_02098) has between 19 and 25 characters. I tried to reconstruct 9- or 11-digit tract GEOIDs by extracting 2 digits for state, 3 digits for county, and then 4 or 6 digits for tract. However, the TRACTA column contains between 1 and 6 characters. My guess is that leading zeros have been dropped, but I wanted to confirm. About 15% of the resulting 11-digit GEOIDs (appending 00 when a tract has 4 digits) do not appear in Brown University’s Longitudinal Tract Data Base (LTDB) crosswalk for 1980-2010, so I wanted to make sure that my inferences about the GEOIDs were correct before investigating the discrepancy further.

Thank you.

Best regards,
Yeonhwa

MPC_vanriper · May 16, 2024, 2:27pm

Dear Yeonhwa,

I looked at the GISJOIN and TRACTA fields for tract_02098 in 1980. First, the TRACTA field in the raw data file (which I inspected via Visual Studio Code) is a quoted string with leading zeroes. I know some software packages aggressively drop leading zeroes when reading data, so it’s possible that they were dropped when you opened the file. If you can retain those leading zeroes, you should be able to construct a reasonable GEOID value.

With respect to the GISJOIN lengths, I find four different lengths: 19, 21, 23, and 25 characters. The 19 and 21 character GISJOINs contain state, county, county subdivision, and tract codes (the 19 character version has a 4 digit tract code and the 21 character version has a 6 digit tract code). They are missing the place code because PLACEA is null. This happens in states in New England because places don’t really exist in those states.

The 23 and 25 character GISJOINs contain state, county, county subdivision, place, and tract codes (the 23 character version has a 4 digit tract code and the 25 character version has a 6 digit tract code).

You could write custom handling for each length of GISJOIN, but I would start by trying to retain leading zeroes in TRACTA, and then concatenating the state and county codes.
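A minimal sketch of that approach is shown below. It assumes the extract has STATEA, COUNTYA, and TRACTA columns holding 2-digit state, 3-digit county, and 4- or 6-digit tract codes, and it follows the convention from the question of appending "00" to 4-digit tracts. The `build_geoid` helper and the sample row are hypothetical. Python's `csv` module reads every field as text, so leading zeroes survive.

```python
import csv
import io


def build_geoid(state, county, tract):
    """Concatenate state, county, and tract codes into an 11-digit GEOID.

    Raises ValueError if any code has an unexpected length, which usually
    means leading zeroes were dropped somewhere upstream.
    """
    state, county, tract = state.strip(), county.strip(), tract.strip()
    if len(state) != 2 or len(county) != 3:
        raise ValueError(f"unexpected state/county code: {state!r}, {county!r}")
    if len(tract) == 4:
        tract += "00"  # pad a 4-digit tract with a "00" suffix
    elif len(tract) != 6:
        raise ValueError(f"unexpected tract code length: {tract!r}")
    return state + county + tract


# Made-up row for illustration; codes stay as strings when read this way.
sample = io.StringIO('STATEA,COUNTYA,TRACTA\n"06","001","0072"\n')
for row in csv.DictReader(sample):
    print(build_geoid(row["STATEA"], row["COUNTYA"], row["TRACTA"]))
```

If you read the file with pandas instead, passing `dtype=str` to `read_csv` should likewise keep the codes as text.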

I also want to note that you need geographic level tract_02498 to get tracts in states without functional county subdivisions. Geographic level tract_02098 contains tracts in states with functional county subdivisions.

Sincerely,
Dave Van Riper
