I downloaded ACS 5-year data covering 2005 to 2018 (the earliest being the 2009 5-year file and the latest the 2018 5-year file). For the 50 largest cities by population, I want to compute city-level averages of person-level statistics (such as labor force participation for mothers with children). How do I link the PUMAs to cities?
The variable CITY does identify some cities, but to identify the top 50 by population you will need to isolate cities by combining PUMAs (there is a PUMA-to-city crosswalk on the description tab for CITY) and/or by using a combination of METRO and MET2013. METRO identifies whether a household is within a metropolitan area and whether it is in the principal city; MET2013 indicates which metropolitan area the household belongs to. Crosswalk spreadsheets at the bottom of the description tab for MET2013 list the specific PUMAs included in each metropolitan area. Note that PUMA codes are not unique across states; you will need to use both the PUMA code and the STATEFIP code in your analysis to identify unique cities.
Based on your description of the samples you are including, I don’t think this is a problem, but I like to remind researchers that their 5-year ACS samples should not overlap; otherwise you are double-counting observations. Assuming you have included the 2009, 2014, and 2018 5-year files only, this shouldn’t be an issue.
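As a rough illustration of the linking step described above, here is a minimal Python sketch that keys each person record on the (STATEFIP, PUMA) pair, looks up a city in a crosswalk, and computes a PERWT-weighted rate per city. The crosswalk entries, city names, the `in_labor_force` flag, and the function name are all hypothetical placeholders; the real PUMA-to-city assignments would come from the crosswalk file on the CITY description tab, and the flag would be derived from whichever labor force variable you use.

```python
from collections import defaultdict

# Hypothetical crosswalk: (STATEFIP, PUMA) -> city label.
# Real values would be read from the crosswalk linked on the CITY description tab.
PUMA_TO_CITY = {
    (17, 3501): "City A",
    (17, 3502): "City A",
    (48, 3501): "City B",  # same PUMA code as above, but a different state
}


def city_weighted_rate(persons, crosswalk):
    """Return {city: PERWT-weighted share of persons with in_labor_force True}."""
    numerator = defaultdict(float)
    denominator = defaultdict(float)
    for person in persons:
        # PUMA codes repeat across states, so the lookup key must include STATEFIP.
        city = crosswalk.get((person["STATEFIP"], person["PUMA"]))
        if city is None:
            continue  # record is not in any city covered by the crosswalk
        denominator[city] += person["PERWT"]
        if person["in_labor_force"]:
            numerator[city] += person["PERWT"]
    return {c: numerator[c] / denominator[c] for c in denominator if denominator[c] > 0}


# Toy person records (made up for illustration, not real ACS data).
persons = [
    {"STATEFIP": 17, "PUMA": 3501, "PERWT": 100, "in_labor_force": True},
    {"STATEFIP": 17, "PUMA": 3501, "PERWT": 50, "in_labor_force": False},
    {"STATEFIP": 17, "PUMA": 3502, "PERWT": 50, "in_labor_force": True},
    {"STATEFIP": 48, "PUMA": 3501, "PERWT": 80, "in_labor_force": True},
    {"STATEFIP": 48, "PUMA": 3501, "PERWT": 20, "in_labor_force": False},
    {"STATEFIP": 6, "PUMA": 101, "PERWT": 70, "in_labor_force": True},  # unmatched
]

rates = city_weighted_rate(persons, PUMA_TO_CITY)
```

Keying on the state and PUMA together keeps the Illinois-style and Texas-style records in the toy data apart even though they share a PUMA code. In practice you would first restrict `persons` to your population of interest (for example, mothers with children) before computing the rate.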
You might also consider whether the data you are looking for is available through NHGIS summary tables. I did a quick search of 2014-2018 data at the City level with the topic ‘Labor Force and Employment Status’ and found what looks like a relevant table, “Women 16 to 50 years who had a birth in the past 12 months by marital status and labor force status.” These tables can be linked to GIS files provided by NHGIS.
Hello, I would like to know what ‘not in identifiable city’ (for the variable CITY) means. I am using data from 2012 to 2019.
As discussed in the Comparability section of the CITY variable description:
In source PUMS files since 1990, the most detailed geographic information available is for Public Use Microdata Areas (PUMAs), which are designed to contain at least 100,000 residents each. … PUMAs are sometimes coterminous with city boundaries, but they also frequently encompass multiple cities and occasionally straddle city boundaries. Therefore, for most cities, and even for some very large cities, it is impossible to identify the exact set of corresponding PUMS records.
Additional text on that page describes the protocols IPUMS USA uses to determine which cities are identifiable. There are also links to crosswalks between PUMAs and cities and to “match summaries” that show how closely PUMAs match up with each large U.S. city.