Hello all,
I am a new user of ZIP Code data. I am curious about how 1990 and 2000 ZCTA data are standardized to 2010 ZCTA geography, since some ZIP Codes did not exist in earlier years. I am using 1990, 2000, 2010, and 2020 decennial census data, mainly on race and housing. I have several questions:
- If I download ZCTA data for 1990, at the 2010 ZCTA geographies, do the ZCTAs that did not exist in 1990 just have no associated data? Or, does NHGIS use smaller 1990 geographies to interpolate 1990 numbers onto 2010 ZCTA polygons, therefore imposing 2010 ZCTA geographies for 1990 population counts?
- Depending on the answer to #1, is it appropriate to say that the 1990 data for a specific ZCTA is the best estimate of the population within the areal polygon of that specific 2010 ZCTA, for the year 1990?
I am trying to determine if I need to gather 1990 block group data and try to aggregate it up to 2010 ZCTA polygons to investigate changes in population from 1990 to 2010, or if the 1990 and 2000 ZCTA counts are appropriate for this purpose.
Thank you for your efforts on this question!
Alex
- If I download ZCTA data for 1990, at the 2010 ZCTA geographies, do the ZCTAs that did not exist in 1990 just have no associated data? Or, does NHGIS use smaller 1990 geographies to interpolate 1990 numbers onto 2010 ZCTA polygons, therefore imposing 2010 ZCTA geographies for 1990 population counts?
If you download a time series table that’s “standardized to 2010,” then the 1990 data in the table are, as you say, based on “smaller 1990 geographies” and use 2010 ZCTA geographies. Specifically, NHGIS allocates data from 1990 blocks to 2010 ZCTAs. The section on “geographically standardized tables” in the time series table documentation provides an overview, with links to detailed methodology.
- Depending on the answer to #1, is it appropriate to say that the 1990 data for a specific ZCTA is the best estimate of the population within the areal polygon of that specific 2010 ZCTA, for the year 1990?
Yes… at least it’s our best estimate! In cases where a 1990 block straddles multiple 2010 ZCTAs, there are many ways to estimate how 1990 characteristics are distributed among the ZCTAs. We’ve done a lot of work to avoid the biggest problems, so I’d claim that our estimates are likely the “best available.”
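To make the allocation idea concrete, here is a generic sketch of weighted allocation. It is an illustration under assumed notation, not a statement of the exact NHGIS model. Let $x_b$ be the 1990 count for block $b$, and let $w_{bz}$ be the (assumed) share of block $b$ that is allocated to 2010 ZCTA $z$:

$$
\hat{y}_z = \sum_{b} w_{bz}\, x_b, \qquad \sum_{z} w_{bz} = 1, \qquad w_{bz} \ge 0
$$

A block that lies entirely within one ZCTA has $w_{bz} = 1$ for that ZCTA, so its whole count goes there. A block that straddles several ZCTAs has its count split among them, and the different estimation approaches amount to different ways of choosing those weights.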
I am trying to determine if I need to gather 1990 block group data and try to aggregate it up to 2010 ZCTA polygons to investigate changes in population from 1990 to 2010, or if the 1990 and 2000 ZCTA counts are appropriate for this purpose.
A couple thoughts about this:
- You shouldn’t need anything other than NHGIS time series tables if (a) your application allows you to use 2010 ZCTA definitions only, and (b) you only need basic data on race and housing, which are covered in NHGIS standardized time series. At this time, we haven’t yet produced standardized time series for other ZCTA definitions (e.g., 2020 ZCTAs) nor do we have any standardized data from “long-form” sources (e.g., income, education, employment status, etc.). If you need data for other ZCTA definitions or for characteristics that are covered only in long-form sources, then you may need to get block group data and standardize it yourself. NHGIS geographic crosswalks can help with that, but those don’t yet directly support allocation from block groups to ZCTAs.
- I’d also recommend that you consider (if you haven’t yet) using different units of analysis, such as block groups or census tracts, rather than ZCTAs. First, ZIP Codes pose a variety of challenges for spatial analysis. Second, if you do want to analyze any long-form characteristics, NHGIS crosswalks make it relatively easy to generate good standardized estimates for 2010 block groups or census tracts. That said, there are still cases where ZCTAs are the best option… I just think it’s good to be aware of the complications and alternatives.
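For anyone who does end up standardizing data themselves, the basic crosswalk step might look something like the sketch below. It assumes a crosswalk reduced to rows of (source ID, target ID, weight), where the weights for each source unit sum to 1. The function name, IDs, and weights are all hypothetical and do not reflect the actual column names or values in NHGIS crosswalk files.

```python
from collections import defaultdict


def allocate(source_counts, crosswalk):
    """Allocate source-zone counts to target zones using crosswalk weights.

    source_counts: dict mapping source ID -> count (e.g., 1990 population)
    crosswalk: iterable of (source_id, target_id, weight) rows, where the
               weights for each source ID are expected to sum to 1
    """
    target_counts = defaultdict(float)
    for source_id, target_id, weight in crosswalk:
        # Sources that are missing from the data contribute nothing.
        target_counts[target_id] += weight * source_counts.get(source_id, 0)
    return dict(target_counts)


# Hypothetical IDs and weights, for illustration only.
pop_1990 = {"BG_A": 1000, "BG_B": 400}
crosswalk = [
    ("BG_A", "TRACT_1", 0.75),  # BG_A straddles two target tracts
    ("BG_A", "TRACT_2", 0.25),
    ("BG_B", "TRACT_2", 1.0),   # BG_B falls entirely within TRACT_2
]

estimates = allocate(pop_1990, crosswalk)
print(estimates)
```

Because each source unit's weights sum to 1, the total count is preserved: the target estimates add up to the same total as the source counts. Comparing the source and target totals is a quick sanity check after applying any crosswalk.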
@JonathanSchroeder, thank you for this insight (and for all your insight on this forum)! You confirmed my suspicions, which is great and saves me a great deal of slogging through GIS applications for my own allocations. I knew this was the case for standard census geographies, but as you pointed out, ZIP/ZCTA geographies are unique, so I wanted to be sure the time series process applied here as well. I am hoping to be able to stick with basic census data on race and housing, but may dive into my own allocations if I determine that other housing data are necessary.
On the point of using another geography, you are right, and I wish I could! ZIP data present a few sticky issues, both technical and theoretical. But other data for this particular project demand the ZIP/ZCTA geography, so for now I will go with that option. Real estate data, at least nationally, are rarely identified by census geographies smaller than the county unless you are working with transaction-level parcel data (which is horrendously expensive for a graduate student…). Thus far, I only anticipate needing the 2010 ZCTA geography, so additional cross-walking shouldn’t be necessary at this time, but the crosswalks are a great resource that I will keep in mind.
Also, my hunch is that ZIP Codes, while arbitrary in their creation, do take on some social particularities and patterning over time, especially regarding real estate markets and the valuation of certain spaces over others. Hopefully this project can make some sense of that…
Again, thanks for the time you put into this. What a great resource and community!