I am trying to identify respondents who live in cities versus suburbs using the ACS. The METRO variable’s distinction between living inside versus outside the principal city of a metro area looks promising. However, I am surprised to find such a large share of households (~30-40% depending on the year) coded METRO = 4 (Central / Principal city status unknown). I take these to be households that are in the metro area but that may or may not be living in the principal city.
Could you explain the process for identifying whether a respondent is in or outside of the principal city, and based on that method, whether these respondents should fall under “city” or “suburbs”, or either?
And if the METRO variable isn’t enough to identify “city” versus “suburbs”, is there another variable or method that you can recommend?
Your interpretation is correct. A coding of METRO=4 means the respondent’s household resides within the metro area, but may or may not be located within the principal city. For a full explanation of why the exact METRO status of such a large percentage of respondents cannot be identified, I recommend reading this User’s Note on the difficulties of identifying metro areas. Basically, the lowest level of identifiable geography in the ACS is the PUMA. Since PUMAs must contain a minimum number of persons for confidentiality reasons, they often straddle geographic boundaries. This makes determining the location of a respondent’s household difficult, especially within geographies as small as a typical metro area.
As a result, working with the METAREA/MET2013 and METRO variables introduces two sources of measurement error into your estimates. First, as the table in the User’s Note above shows, several MSAs themselves cannot be completely identified. For example, 17% of the Oklahoma City MSA population cannot be confidently identified as living within the MSA. Second, even if an MSA is completely, or almost completely, identified, it may still have a large percentage of residents with METRO=4. For example, the Chicago MSA is 99% identified in the ACS samples, but approximately 30% of residents are coded METRO=4. In either case, you cannot assume that the identified portion of a geographical unit is a representative sample of the complete unit.
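To see how large the second problem is for the metro areas you care about, you could tabulate the weighted share of METRO=4 residents within each metro area. The following is only a minimal sketch: the records are made-up toy values, the metro codes are placeholders, the function name is hypothetical, and it assumes that your extract includes MET2013, METRO, and the person weight PERWT.

```python
from collections import defaultdict

# Hypothetical toy records: (MET2013 code, METRO code, PERWT person weight).
# All values are invented for illustration; a real extract would supply them.
records = [
    (16980, 2, 100.0),
    (16980, 3, 250.0),
    (16980, 4, 150.0),
    (36420, 2, 80.0),
    (36420, 4, 120.0),
]

def share_status_unknown(rows):
    """Return, per metro code, the weighted share of persons coded METRO == 4."""
    total = defaultdict(float)
    unknown = defaultdict(float)
    for met, metro, weight in rows:
        total[met] += weight
        if metro == 4:  # in a metro area, principal city status unknown
            unknown[met] += weight
    return {met: unknown[met] / total[met] for met in total}

shares = share_status_unknown(records)
for met, share in sorted(shares.items()):
    print(f"MET2013 {met}: {share:.0%} of weighted persons coded METRO=4")
```

A metro area with a high share here is one where the city versus suburb split is largely undetermined, regardless of how completely the metro area itself is identified.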
Bottom line, METRO is still the best available method to identify “city” versus “suburbs” given the confidentiality restrictions on identifiable geographies, but this method is subject to error. The accuracy of identifying metro areas and central cities varies considerably across individual MSAs; thus, whether using METRO leads to a tolerable level of error depends on the geography of interest and is left to the researcher’s discretion.
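One way to gauge whether the error is tolerable for a given metro area is to bound the principal-city share: treat every METRO=4 resident as a suburban resident for the lower bound and as a city resident for the upper bound. This is a rough sketch, not an official IPUMS procedure. The function name and the toy weights are hypothetical, and it assumes the METRO codes 2 (in principal city), 3 (not in principal city), and 4 (status unknown).

```python
def city_share_bounds(rows):
    """Bound the weighted share of metro residents living in the principal city.

    rows: iterable of (METRO code, weight) pairs for a single metro area.
    Lower bound counts only METRO == 2; upper bound adds every METRO == 4.
    """
    rows = list(rows)
    city = sum(w for m, w in rows if m == 2)
    unknown = sum(w for m, w in rows if m == 4)
    total = sum(w for m, w in rows if m in (2, 3, 4))
    return city / total, (city + unknown) / total

# Made-up weights for one hypothetical metro area.
toy_rows = [(2, 100.0), (3, 250.0), (4, 150.0)]
lower, upper = city_share_bounds(toy_rows)
print(f"Principal-city share lies between {lower:.0%} and {upper:.0%}")
```

If the two bounds are close, the METRO=4 group matters little for that metro area; if they are far apart, conclusions about “city” versus “suburbs” there deserve more caution.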