County-Level Voter Turnout

I’m trying to figure out how to calculate the black-to-white voter turnout ratio at the county level. Can I aggregate the individual-level voter supplement extract data to the county level? Or is there a better way to calculate this county-level variable? I’ve seen it used in several studies, but I’m not sure of the best way to actually calculate and use it myself.

It is possible to calculate these turnout ratios using CPS data. However, CPS statistics are not very robust at the county level, especially with a single month of data. I’d recommend looking at studies and seeing the sources they use. This site has helpful guidance on how to estimate voting turnout rates. Some sources you may be interested in are official data from state governments such as this state data from Minnesota.

If you do decide to use CPS data, I would strongly recommend starting by reviewing the Voting and Registration Supplement sample notes so that you’re aware of the peculiarities of this sample. You will have to calculate these county-level ratios using the individual-level data that’s provided in the voter supplement. To find the total black and white population for each county you will want to sum VOSUPPWT for each value of RACE and COUNTY. To specifically find the population that is eligible to vote (i.e. persons age 18+ who are US citizens, including individuals who are native born, born abroad of American parents, or naturalized citizens), you will want to sum the weights only of those who are in the universe of VOTED (i.e. do not have a code of 99). You can then use VOTED again to sum the weights of everyone who answered that they voted by RACE and COUNTY. This will let you derive a voter turnout statistic by race which you can then use to create a black-to-white turnout ratio.
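The weighted sums described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the rows are made-up records, not real CPS data, and the codes used here (VOTED = 2 for "voted", VOTED = 99 for not in universe, RACE = 100 for white, RACE = 200 for black) should be checked against the codebook for your own extract. The function names are hypothetical.

```python
from collections import defaultdict

# Assumed codes; verify against the variable documentation for your extract.
NIU = 99                  # VOTED: not in universe (not eligible)
VOTED_YES = 2             # VOTED: reported voting
WHITE, BLACK = 100, 200   # RACE: white / black

def turnout_by_county_race(rows):
    """Return {(COUNTY, RACE): weighted turnout rate} for eligible persons."""
    eligible = defaultdict(float)
    voted = defaultdict(float)
    for r in rows:
        if r["VOTED"] == NIU:
            continue  # outside the VOTED universe, so not counted as eligible
        key = (r["COUNTY"], r["RACE"])
        eligible[key] += r["VOSUPPWT"]
        if r["VOTED"] == VOTED_YES:
            voted[key] += r["VOSUPPWT"]
    return {k: voted[k] / eligible[k] for k in eligible if eligible[k] > 0}

def black_white_ratio(rates, county):
    """Black turnout divided by white turnout, or None if either is missing."""
    b = rates.get((county, BLACK))
    w = rates.get((county, WHITE))
    if b is None or not w:
        return None
    return b / w

# Made-up records for one hypothetical county code (1001).
rows = [
    {"COUNTY": 1001, "RACE": WHITE, "VOTED": 2, "VOSUPPWT": 100.0},
    {"COUNTY": 1001, "RACE": WHITE, "VOTED": 1, "VOSUPPWT": 100.0},
    {"COUNTY": 1001, "RACE": BLACK, "VOTED": 2, "VOSUPPWT": 50.0},
    {"COUNTY": 1001, "RACE": BLACK, "VOTED": 1, "VOSUPPWT": 150.0},
    {"COUNTY": 1001, "RACE": BLACK, "VOTED": 99, "VOSUPPWT": 500.0},
]
rates = turnout_by_county_race(rows)
ratio = black_white_ratio(rates, 1001)
```

Note that this keeps every non-99 response in the denominator, so people who refused or did not know whether they voted count as eligible non-voters. If you want to treat those responses differently, adjust the filter accordingly.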

Also of note is that not all county codes are identified in the CPS. In fact, only about 45 percent of households in recent years are located in a county that is identified. This is important if in addition to calculating the voter turnout ratio you also want to know the names and locations of the counties in your sample.

Hello, I wanted to ask a follow-up to this question here –

I am tackling a similar problem where I want to look at education and income (instead of race) for voters. I want to be as granular as possible while achieving complete coverage of the sample. My current idea is that I can aggregate at the county level where available and otherwise fall back to the MSA. This gives me about 73% coverage. If I additionally use “STATEFIP”, “COUNTY”, “METFIPS”, “METRO”, and “INDIVIDCC”, I can ensure 100% coverage, and I am breaking the country into 757 groups which I can map, although some of these geographic regions are certainly a little odd.

Is there a better way to do this? Are there any geographic variables I am missing?

Finally, in an ideal world I only need the data at the county level – I assume there is no way to provide just that aggregate without any other microdata? If it is helpful I only really need college/non-college for education and quintiles for income!

Any guidance here would be hugely appreciated!

The approach you describe seems like the best option for identifying the most granular geography for each person in the CPS Voter Supplement. I’m finding that about 40% of respondents in the November 2022 supplement resided in one of 556 counties that are large enough to be identified on their own. Since metro areas are composed of one or more counties, there are many cases where the county of residence is not identified, but the metro area is. As an example, of those residing in unidentified counties in the November 2022 supplement, about 60% resided in one of 151 identified METFIPS metro areas.

The remainder of the population, which resided in neither an identified county nor an identified metro area, can be identified as residing in one of the 50 states. These respondents will typically be located in small rural counties in their state, but they can also include residents of smaller unidentified metro areas. You might use the variable METRO to further divide respondents within a state between non-metro and metro residents (i.e., METRO = 1 vs. METRO = 2, 3, 4). Note that METRO = 0 (not identified) groups respondents both in unidentified metro areas and in non-metro areas; as a result, their metro status is indeterminate. INDIVIDCC is unlikely to give you much additional information, since in most cases if a county is not identified then the individual city is also not identified.
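The fallback order discussed here (county, then metro area, then state split by metro status) could be sketched as a small function that assigns each person a geography key. The "not identified" codes are assumptions: COUNTY = 0 and the METFIPS values 99998 and 99999 are what I would expect, but please confirm them in the variable documentation. The function name `geo_key` is hypothetical.

```python
def geo_key(row, metfips_unidentified=(99998, 99999)):
    """Return the most granular geography available for one person record."""
    # 1. County, when it is identified (assumes COUNTY = 0 means not identified).
    if row["COUNTY"] != 0:
        return ("county", row["COUNTY"])
    # 2. Metro area, when it is identified (unidentified codes are an assumption).
    if row["METFIPS"] not in metfips_unidentified:
        return ("metro", row["METFIPS"])
    # 3. State, split by metro status where METRO allows it.
    if row["METRO"] == 1:
        status = "nonmetro"
    elif row["METRO"] in (2, 3, 4):
        status = "metro"
    else:
        status = "unknown"  # METRO = 0: metro status is indeterminate
    return ("state", row["STATEFIP"], status)

# Made-up records illustrating each level of the fallback.
in_county = {"STATEFIP": 27, "COUNTY": 27053, "METFIPS": 33460, "METRO": 2}
in_metro = {"STATEFIP": 27, "COUNTY": 0, "METFIPS": 33460, "METRO": 3}
in_state = {"STATEFIP": 27, "COUNTY": 0, "METFIPS": 99998, "METRO": 1}
keys = [geo_key(r) for r in (in_county, in_metro, in_state)]
```

Counting the distinct keys in your extract gives the number of groups, and counting weighted cases per key shows which groups have too few respondents to support an estimate.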

With all of these tools in mind, you should be aware that the small sample sizes for these granular geographic areas may add significant sampling error to your estimates. The Census Bureau specifically states that estimates for individual metropolitan areas “should be treated with caution. Although estimates for the larger areas such as New York, Los Angeles, and so forth, should be fairly accurate and valid for a multitude of uses, estimates for the smaller metropolitan areas (those with populations under 500,000) should be used with caution because of the relatively large sampling variability associated with these estimates.” Similar caution applies to county-level estimates.

One way to gauge the sampling variability of your estimates is to calculate their standard errors using the CPS supplement replicate weights. You will need to download these separately, since we currently do not provide them on IPUMS CPS, but you can link the replicate weight observations with an IPUMS CPS extract using the linking keys HRHHID, HRHHID2, and LINENO (referred to as PULINENO in the Census Bureau data). Replicate weights allow a single sample to simulate a range of possible alternative samples, generating more informed standard error estimates that mimic the theoretical basis of standard errors while retaining all information about the complex sample design. See our user guide with sample code for running analyses using replicate weights.
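As a rough sketch of the replicate-weight calculation: you compute your statistic once with the full-sample supplement weight and once with each replicate weight, then combine the squared differences. The scaling factor of 4 divided by the number of replicates is my understanding of the formula for CPS replicate weights, so treat it as an assumption and confirm it in the user guide. The replicate estimates below are made-up numbers, and the function name is hypothetical.

```python
import math

def replicate_se(full_estimate, replicate_estimates, factor=4.0):
    """Standard error from replicate estimates.

    Computes sqrt(factor / R * sum((theta_r - theta_0) ** 2)), where theta_0
    is the full-sample estimate and theta_r are the R replicate estimates.
    The default factor of 4 is an assumption to verify for your data.
    """
    r = len(replicate_estimates)
    if r == 0:
        raise ValueError("need at least one replicate estimate")
    ss = sum((e - full_estimate) ** 2 for e in replicate_estimates)
    return math.sqrt(factor / r * ss)

# Made-up turnout estimates: one full-sample value and four replicates.
# A real file would have many more replicate weights than this.
full = 0.60
replicates = [0.59, 0.61, 0.60, 0.62]
se = replicate_se(full, replicates)
```

The same function works for any statistic, including a turnout ratio, as long as each replicate estimate is recomputed from scratch with the corresponding replicate weight.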

As far as I’m aware, the Census Bureau does not publish county-level summary statistics for the CPS Voter Supplement. However, you can obtain education and income summary statistics at the county level through our spatial data project, IPUMS NHGIS. We recommend that new users of IPUMS NHGIS review the short video tutorials on the user guide as well as relevant portions of the FAQ page to familiarize themselves with the website.
