Hello @MPC_vanriper and others, I am re-upping a prior thread on identifying rural counties in CPS as I am similarly trying to identify all CPS-ASEC respondents in non-metro areas (not counties) in a given state to proxy for rural status. I had originally used METRO = 1 (“not in metropolitan area”) under the assumption this would be the best proxy for rural, but I have found that certain states have trend-breaks in what is coded as METRO = 0 versus METRO = 1.
For example, in Colorado there are no respondents with METRO = 0 until 2005/2006, when there are suddenly many respondents with METRO = 0 and none with METRO = 1 (please see screenshot below for raw N from the online analysis tool). This reverses in 2015/2016. The same thing happens in other states (DE, LA, others), although the trend breaks occur in different years.
My question is whether these individuals are incorrectly being coded as unidentified (METRO = 0) when they should be “not in metro area” (METRO = 1) in all sample years, or is there something else going on here that I am missing? Thanks very much in advance for your help.
IPUMS assigns households to METRO = 0 in cases when the Census Bureau does not disclose whether a household resided in a metro or non-metro area (i.e., cases where the unharmonized variable UH_METSTAT_A1 or UH_METSTAT_B1 = 3). These households therefore cannot be categorized either as residing in a metro area or non-metro area and may include both types of households.
To protect respondent confidentiality, the Census Bureau uses internal population thresholds to determine which metro areas (METFIPS) can be identified in the data and which are too small to identify. In the case of Colorado, all six metro areas based on the June 30, 1993, MSAs are identified in the ASEC from 1995-2004. From 2005-2014, the ASEC used the June 2003 MSAs (now called CBSAs) delineated by the Office of Management and Budget. The set of Colorado MSAs in the June 2003 delineation includes a single metro area (Grand Junction) that did not meet the threshold for identification in METFIPS. It also appears not to have met the threshold for identification in METRO. Because it was the only unidentified metro area in the state, its households could not be coded as METRO = 2, 3, or 4: it would then be trivial to infer which metro area these respondents came from. Assigning only these respondents to METRO = 0 would not address the concern either, since they would be the only households in that category. The METRO = 0 category therefore needed to be supplemented with respondents not in metro areas, which is why we cannot identify respondents from non-metro areas in the state from 2006-2014.
From 2015 to the present, the ASEC does not identify two metro areas in Colorado: Grand Junction and Pueblo. However, there are no respondents with METRO = 0, and all respondents with an unidentified METFIPS (99998) are coded with METRO = 1. It therefore appears that neither of these metro areas was sampled from 2015 onward. You can read more about the CPS sample design in Chapter 2-2 of the Design and Methodology Technical Paper 77. In particular, the paper notes that while each MSA forms its own primary sampling unit (PSU), not every PSU is selected for inclusion in the sample. Note that since there are no respondents with METRO = 0, there is no need to supplement this category with non-metro respondents, so non-metro respondents can be identified in this period.
You’ll notice that the transition between these periods is not perfect; there are observations across all METRO categories in 2005 and 2015. These respondents are from the ASEC Hispanic oversample and are sampled separately using the preceding period’s sampling procedure.
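If it helps, here is a rough sketch of how one might flag, for a given state, the years in which METRO = 1 can serve as a non-metro proxy, i.e. years with no METRO = 0 respondents. It assumes an extract containing YEAR, STATEFIP, and METRO; the rows below are made-up placeholders rather than real counts, and the function names are hypothetical. Transition years such as 2005 and 2015, which contain some METRO = 0 cases, would be excluded by this rule.

```python
from collections import Counter, defaultdict

# Hypothetical extract rows as (YEAR, STATEFIP, METRO); values are illustrative only.
# STATEFIP 8 is Colorado. METRO 0 = not identifiable, 1 = not in metro area.
records = [
    (2004, 8, 1), (2004, 8, 2), (2004, 8, 3),
    (2010, 8, 0), (2010, 8, 2), (2010, 8, 3),
    (2016, 8, 1), (2016, 8, 2), (2016, 8, 3),
]

def metro_counts_by_year(rows, statefip):
    """Unweighted counts of each METRO code by year for one state."""
    counts = defaultdict(Counter)
    for year, state, metro in rows:
        if state == statefip:
            counts[year][metro] += 1
    return counts

def nonmetro_identifiable_years(rows, statefip):
    """Years with at least one METRO = 1 case and no METRO = 0 cases."""
    counts = metro_counts_by_year(rows, statefip)
    return sorted(
        year for year, c in counts.items()
        if c[0] == 0 and c[1] > 0
    )

print(nonmetro_identifiable_years(records, 8))
```

Running the same check on each state separately should show where the trend breaks fall, since the unidentified metro areas and their thresholds differ by state and delineation period.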