Definitions for orig_area, shape_leng and shape_area on 1930-2000 Tract Tiger 2008 GIS Files

Can you provide an easy-to-understand definition for the columns orig_area, shape_leng, and shape_area in the GIS files for the 1930-2000 tracts (TIGER 2008)?

Can you also provide a definition for area_uncli?

The problem I am having is that each state-county pair has multiple tract numbers (see the table below), and the only columns that could differentiate between them are the ones mentioned above. But I do not know what they mean.

decade | statenam | nhgisnam | tract | orig_area | shape_leng | shape_area
1930 | California | Los Angeles | 100 | 59888907.27 | 42526.95182 | 59808492.52
1930 | California | Los Angeles | 200 | 42863501.55 | 31753.07723 | 42826617.37

What I eventually want to do with this information is identify a city, a county subdivision, latitude/longitude coordinates, or ZIP codes. Is that even possible?

I read the answer in a previous topic about the units of shape_area and shape_leng, but can you be more specific? Can I use the center point of the shape area to get latitude/longitude? If that is possible, do you know of particular software that does that kind of transformation (can it be done with PostGIS)?

I also noticed that tract numbers are reused for each state-county pair (100, 200, etc.), and that shape_area, orig_area, and shape_leng are the same for each tract. That tells me that I will most likely not be able to get the lat/lon from those columns. If that is true, do you know another way to get lat/lon, county subdivisions, or ZIP codes for the 1930-2000 tracts?

Any help on this would be greatly appreciated. Thank you!

Hi Ivan,

Based on your question, it seems as if you’re trying to compute the X and Y coordinates for the centroid of each census tract in the 1930-2000 shapefiles. Then, you want to use those coordinates (which you will convert to points) to determine the county subdivision/ZIP code tabulation area for each census tract (through some spatial overlay operation).

If you load the census tracts into PostGIS, I believe you can use the ST_Centroid function to determine the X and Y coordinates for the center of each tract. Then, if you download the county subdivisions or ZIP code tabulation areas from NHGIS (we have county subdivisions for 1980 - present and ZIP code tabulation areas for 2000 - present), you can overlay the X and Y coordinates to determine the county subdivision or ZIP code tabulation area for each tract. I don’t use PostGIS, but I assume there is a spatial overlay that will accomplish that goal.
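
To make the centroid step concrete, here is a minimal pure-Python sketch of the area-weighted centroid of a single polygon ring. It only illustrates the idea and is not how PostGIS implements ST_Centroid; the function name `polygon_centroid` and the sample coordinates are made up for this example, and it ignores holes and multipart tracts.

```python
def polygon_centroid(ring):
    """Area-weighted centroid of a simple polygon ring [(x, y), ...].

    Coordinates are assumed to be planar (e.g. projected meters).
    Holes and multipolygons are NOT handled in this sketch.
    """
    # Close the ring if the first vertex is not repeated at the end.
    closed = ring if ring[0] == ring[-1] else ring + [ring[0]]
    twice_area = 0.0
    cx = 0.0
    cy = 0.0
    for (x0, y0), (x1, y1) in zip(closed, closed[1:]):
        cross = x0 * y1 - x1 * y0      # shoelace term for this edge
        twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if twice_area == 0:
        raise ValueError("degenerate polygon (zero area)")
    # Centroid = sum / (6 * area) = sum / (3 * twice_area)
    return cx / (3.0 * twice_area), cy / (3.0 * twice_area)


# Hypothetical rectangular "tract", 4 units wide and 2 units tall.
toy_tract = [(0.0, 0.0), (4.0, 0.0), (4.0, 2.0), (0.0, 2.0)]
print(polygon_centroid(toy_tract))
```

For a strongly concave tract the centroid can fall outside the polygon. PostGIS also provides ST_PointOnSurface, which returns a point guaranteed to lie on the geometry, if that matters for the overlay.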

If you download the county subdivision or ZIP code tabulation area shapefiles from NHGIS, they are all in the same coordinate system (USA Albers Equal Area Contiguous). Thus, you won’t need to reproject anything to overlay the X and Y coordinates on them. If you need latitude/longitude coordinates, you will need to use the ST_Transform function to convert the census tracts (and county subdivisions and ZIP code tabulation areas) to a geographic coordinate system. Then, you can use the ST_Centroid function to calculate the X and Y coordinates for the centroids in latitude/longitude.
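
To show roughly what the reprojection step does, here is a pure-Python sketch of the ellipsoidal Albers equal-area conic formulas, forward and inverse. The projection parameters (standard parallels 29.5 and 45.5, central meridian -96, latitude of origin 37.5, GRS80 ellipsoid) are my assumption for "USA Contiguous Albers Equal Area Conic" and should be checked against the .prj file that ships with the shapefile. In practice ST_Transform or a projection library would do this for you; the function names below are made up for the example.

```python
import math

# ASSUMED parameters; verify against the shapefile's .prj before relying on them.
A = 6378137.0                    # GRS80 semi-major axis, meters
F = 1 / 298.257222101            # GRS80 flattening
E2 = F * (2 - F)                 # eccentricity squared
E = math.sqrt(E2)
LAT0, LON0 = math.radians(37.5), math.radians(-96.0)   # origin
LAT1, LAT2 = math.radians(29.5), math.radians(45.5)    # standard parallels


def _m(lat):
    return math.cos(lat) / math.sqrt(1 - E2 * math.sin(lat) ** 2)


def _q(lat):
    s = math.sin(lat)
    return (1 - E2) * (s / (1 - E2 * s * s)
                       - math.log((1 - E * s) / (1 + E * s)) / (2 * E))


N = (_m(LAT1) ** 2 - _m(LAT2) ** 2) / (_q(LAT2) - _q(LAT1))
C = _m(LAT1) ** 2 + N * _q(LAT1)
RHO0 = A * math.sqrt(C - N * _q(LAT0)) / N


def albers_forward(lat_deg, lon_deg):
    """Latitude/longitude in degrees -> projected (x, y) in meters."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    rho = A * math.sqrt(C - N * _q(lat)) / N
    theta = N * (lon - LON0)
    return rho * math.sin(theta), RHO0 - rho * math.cos(theta)


def albers_inverse(x, y):
    """Projected (x, y) in meters -> (latitude, longitude) in degrees."""
    rho = math.hypot(x, RHO0 - y)
    theta = math.atan2(x, RHO0 - y)
    q = (C - (rho * N / A) ** 2) / N
    lat = math.asin(q / 2)               # spherical first guess
    for _ in range(20):                  # Newton iteration on latitude
        s = math.sin(lat)
        step = ((1 - E2 * s * s) ** 2 / (2 * math.cos(lat))) * (
            q / (1 - E2) - s / (1 - E2 * s * s)
            + math.log((1 - E * s) / (1 + E * s)) / (2 * E))
        lat += step
        if abs(step) < 1e-13:
            break
    return math.degrees(lat), math.degrees(LON0 + theta / N)


# Round trip for a point near Los Angeles (illustrative coordinates).
x, y = albers_forward(34.05, -118.25)
print(albers_inverse(x, y))
```

This sketch ignores any datum shift between NAD83 and WGS84, which is small compared with the size of a census tract.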

With respect to the attributes you’ve listed, they mean:

shape_leng - the length of the perimeter of the census tract, in meters (computed after we erase the coastal water from census tracts)

shape_area - the area of the census tract, in square meters (computed after we erase the coastal water from census tracts)

orig_area - the area of the census tract, in square meters, of the unclipped polygon (before we erase out the coastal water)

area_uncli is supposed to be the same as shape_area, but it looks like that isn’t always true. I don’t know why!
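
As a rough illustration of the shape_leng and shape_area definitions above, the sketch below computes a perimeter and an area from projected coordinates in meters, then compares orig_area with shape_area for the first row of the table in the question. The helper names and the rectangle are invented for the example; reading the difference as erased coastal water follows from the definitions above and is not something I have checked against the source polygons.

```python
import math


def _closed(ring):
    # Repeat the first vertex at the end if the ring is not already closed.
    return ring if ring[0] == ring[-1] else ring + [ring[0]]


def ring_perimeter(ring):
    """Perimeter of a polygon ring, in the units of its coordinates."""
    pts = _closed(ring)
    return sum(math.hypot(x1 - x0, y1 - y0)
               for (x0, y0), (x1, y1) in zip(pts, pts[1:]))


def ring_area(ring):
    """Unsigned area of a simple polygon ring (shoelace formula)."""
    pts = _closed(ring)
    twice = sum(x0 * y1 - x1 * y0
                for (x0, y0), (x1, y1) in zip(pts, pts[1:]))
    return abs(twice) / 2.0


# Hypothetical 1000 m by 2000 m rectangle in projected meters.
rect = [(0.0, 0.0), (1000.0, 0.0), (1000.0, 2000.0), (0.0, 2000.0)]
print(ring_perimeter(rect))   # analogous to shape_leng (meters)
print(ring_area(rect))        # analogous to shape_area (square meters)

# Tract 100, Los Angeles, 1930 (values from the table in the question).
orig_area = 59888907.27       # before the coastal water is erased
shape_area = 59808492.52      # after the coastal water is erased
erased_m2 = orig_area - shape_area
print(round(erased_m2, 2), "square meters erased")
print(round(shape_area / 1e6, 2), "square kilometers remaining")
```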

I hope this helps answer your question!

Dave Van Riper
IPUMS NHGIS


Dave,

Thank you very much for the fast reply and for the clear description of the process. I will use the ST_Centroid function to find the center of each tract, convert the centroids to points, and then use those coordinates to get the latitudes and longitudes.

I am not sure it is a good idea to attach ZIP codes to the tract and county data, since the ZIP code system was implemented in 1963, and I do not know whether county subdivisions changed substantially between 1930 and 1980. Since you know how the tract and county data were created, does it make sense to combine more recent ZIP codes (2000) and county subdivisions (1980) with older data (1930-1980), or is that a mismatch that could result in faulty data?

Thank you again, this was very helpful!

Best!

Ivan

Hi Ivan,

I would need to know a little bit more about what you’re trying to do with the older tract/county data and the more recent ZIP codes and county subdivisions to answer your question. Are you trying to aggregate census tracts from 1930-1980 to approximate ZIP codes or county subdivisions?

If you can provide more context, I can provide a more concrete answer!

Dave

Dear Dave,

Thank you for the time invested in these questions.

I am trying to use this data for research purposes. I have historical addresses for the research subjects. For each historical address, I need to identify the tract number, latitude, longitude, and ZIP code. The addresses are from different years and decades, and some of the places do not even exist anymore. What we want is to get, for each address, information similar to that provided by the IPUMS Geomarker, but the tool needs to run locally (on a PC or local server, perhaps with ArcGIS Desktop or PostGIS) because we need to protect personal information and work with historical data.

I looked for the NHGIS tract centroids but was not able to find them in either of the two versions (TIGER 2000 and 2008). Centroids are only available for the counties (at least in the 1930 GIS files).

Thank you!

Ivan

Dear Ivan,

Thanks for the added details!

We don’t include the centroid for census tracts in the shapefiles. You can compute the centroids using ArcGIS Desktop (using the AddXY function or something like that) or PostGIS (using the ST_Centroid function).

Based on your description of your research project, your goal is to geocode the historical addresses, which will provide you with a latitude/longitude for each participant. Once you have the lat/long for each address, you can then overlay the lat/long with a census tract (or county subdivision or county) shapefile to assign it the attributes from the census tract (or county subdivision or county).
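
The overlay step can be pictured as a point-in-polygon test. The sketch below uses a simple ray-casting check on made-up square "tracts"; the function names, attribute dictionaries, and coordinates are hypothetical, and it assumes the point and the polygons are in the same coordinate system. In PostGIS or ArcGIS you would use the built-in spatial join tools instead (PostGIS has ST_Contains and ST_Within for this kind of test).

```python
def point_in_ring(x, y, ring):
    """Ray-casting test: True if (x, y) is inside the simple polygon ring.

    Points exactly on an edge may land on either side; holes are ignored.
    """
    inside = False
    n = len(ring)
    for i in range(n):
        x0, y0 = ring[i]
        x1, y1 = ring[(i + 1) % n]
        # Does this edge straddle the horizontal line through y?
        if (y0 > y) != (y1 > y):
            # x coordinate where the edge crosses that horizontal line
            x_cross = x0 + (y - y0) * (x1 - x0) / (y1 - y0)
            if x < x_cross:
                inside = not inside
    return inside


def assign_tract(x, y, tracts):
    """Return the attributes of the first tract containing the point, else None."""
    for attrs, ring in tracts:
        if point_in_ring(x, y, ring):
            return attrs
    return None


# Two hypothetical adjacent square tracts with invented coordinates.
toy_tracts = [
    ({"decade": 1930, "tract": "100"}, [(0, 0), (10, 0), (10, 10), (0, 10)]),
    ({"decade": 1930, "tract": "200"}, [(10, 0), (20, 0), (20, 10), (10, 10)]),
]

# A geocoded address at (15, 5) falls in the second square.
print(assign_tract(15, 5, toy_tracts))
```

Because tract boundaries change from decade to decade, each address would need to be overlaid on the tract file for the decade that matches it.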

I think you are right to worry about changes to county subdivisions over time - they aren’t particularly stable, especially in southern and western states. We don’t have any resources, however, to help you track those changes over time.

You’ll need street data to support the geocoding on your local machine. You can use Census TIGER/Line files for that if you don’t have another source.

Good luck!

Dave
