Hi folks. Documentation for COSTELEC is quite thorough, but doesn’t seem to include any information about how ACS ELEP numbers (“LAST MONTH, what was the cost of electricity for this house, apartment, or mobile home?”) are annualized to COSTELEC. Same question for the annualization of GASP to COSTGAS.
For context, we’ve been able to match COSTWATR and COSTFUEL household-weighted means precisely to Census aggregate tables for WATP and FULP, respectively, but the corresponding annualized values for COSTELEC and COSTGAS do not bear any resemblance to ELEP * 12 or GASP * 12, for example.
Extra credit: we’re also looking for any documentation of whether and how the Census might adjust “last month” values of ELEP and GASP pre-release to data.census.gov and to IPUMS. Is there any attempt to wash out seasonality or within-year inflation prior to release? If so, what’s the method?
If no one here knows or cares, rest assured that we will doggedly pursue this until we can answer. I’ll be back!
COSTELEC is calculated by multiplying the monthly values reported in the ACS Public Use Microdata Sample (PUMS) by 12 with two exceptions: (1) variable-specific specialty/missing codes 9993 (“No charge or no electricity used”) and 9997 (“Electricity included in rent or in condo fee”) where no monthly electricity costs are provided, and (2) top coding.
To preserve respondent confidentiality, the Census Bureau edits the top 0.5% of reported values in each state/year pair and reassigns them the median value of this top-coded group. For example, the 2023 original public use microdata sample from the Census Bureau sets the top-coding threshold for monthly electricity costs in Alaska at $1,000 and then assigns all households exceeding this cost a value of $2,400 (the group’s median). However, the maximum value for IPUMS COSTELEC is currently fixed at $9,990. As a result, any state/sample-specific top code for monthly electricity expenditures that exceeds $9,990 when multiplied by 12 is recoded to $9,990 in COSTELEC, even though this is lower than the top code reported in the original data. I have a message out to my colleagues to have this issue addressed. That said, this only affects roughly the top 1% of values in the 2023 ACS. Note that COSTGAS has the same issue of a fixed top code being enforced by the IPUMS recoding protocol.
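To make the arithmetic concrete, here is a minimal Python sketch of the annualization as described above. It is not the actual IPUMS recoding code: the function name `annualize_monthly_cost` is made up for illustration, and it assumes the special codes (9993, 9997) have already been set aside so that only dollar amounts are passed in.

```python
# Current fixed ceiling for COSTELEC, as described in this thread.
IPUMS_MAX_ANNUAL = 9990


def annualize_monthly_cost(monthly_dollars, max_annual=IPUMS_MAX_ANNUAL):
    """Multiply a monthly cost by 12, then truncate at the fixed ceiling.

    Hypothetical helper: assumes special codes were removed beforehand.
    """
    return min(monthly_dollars * 12, max_annual)


# A typical bill is unaffected by the ceiling.
print(annualize_monthly_cost(150))   # 1800

# The 2023 Alaska top-code value of $2,400/month would be $28,800/year,
# so it is truncated to the ceiling.
print(annualize_monthly_cost(2400))  # 9990
```

Under this rule, any monthly value above $832.50 (9,990 / 12) ends up at the ceiling.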
It’s possible to run a direct comparison with published monthly figures using the unedited PUMS monthly value that we provide in the source variable US2023A_ELEP. I’m unsure which Census aggregate tables you are comparing your estimates to, but I was able to find this one that provides estimates of the number of households across six brackets of monthly electricity expenditures. Below, I’ve shared my estimates using IPUMS data for each of these brackets (weighting my analysis with HHWT and restricting to a single observation per household using PERNUM == 1). I drop households with the variable-specific codes 9993 and 9997. These estimates are overall very close to the figures reported in the table (~2% off) and are identical whether I estimate using the monthly source variable or COSTELEC/12. The two are equivalent in this specific case because the highest bracket is $250 or more: a household truncated at the IPUMS-imposed maximum of $9,990 still has COSTELEC/12 = $832.50, which falls in that top bracket either way. While these estimates are not within the published margin of error, they certainly bear a strong resemblance. I suspect that any remaining difference is due to the fact that the table is produced using the internal restricted data file, which includes the entire ACS sample (the PUMS includes about 2/3 of the entire sample) as well as unrounded values (the PUMS rounds values to the nearest $10).
Less than $50: 6,081,040
$50 to $99: 24,265,416
$100 to $149: 27,716,062
$150 to $199: 21,306,611
$200 to $249: 16,104,809
$250 or more: 29,737,541
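For anyone who wants to reproduce this kind of tabulation, the steps (one record per household, drop the special codes, weight by HHWT, bin COSTELEC/12) can be sketched roughly as follows. This is an illustrative sketch only: the rows are made-up toy data, the helper names are hypothetical, and it assumes not-in-universe households have already been removed from the input.

```python
# Special codes for COSTELEC mentioned in this thread.
SPECIAL_CODES = {9993, 9997}

# Monthly brackets: (label, lower bound inclusive, upper bound exclusive).
BRACKETS = [
    ("Less than $50", 0, 50),
    ("$50 to $99", 50, 100),
    ("$100 to $149", 100, 150),
    ("$150 to $199", 150, 200),
    ("$200 to $249", 200, 250),
    ("$250 or more", 250, float("inf")),
]


def bracket_label(monthly):
    """Return the label of the bracket containing a monthly dollar amount."""
    for label, low, high in BRACKETS:
        if low <= monthly < high:
            return label
    raise ValueError(f"no bracket for {monthly}")


def weighted_bracket_counts(rows):
    """Sum HHWT by monthly-cost bracket, using one record per household."""
    counts = {label: 0 for label, _, _ in BRACKETS}
    for row in rows:
        if row["PERNUM"] != 1:              # keep one record per household
            continue
        if row["COSTELEC"] in SPECIAL_CODES:  # no dollar amount reported
            continue
        counts[bracket_label(row["COSTELEC"] / 12)] += row["HHWT"]
    return counts


# Toy person-level records (invented values, not real ACS data).
rows = [
    {"SERIAL": 1, "PERNUM": 1, "HHWT": 100, "COSTELEC": 480},
    {"SERIAL": 1, "PERNUM": 2, "HHWT": 100, "COSTELEC": 480},
    {"SERIAL": 2, "PERNUM": 1, "HHWT": 80, "COSTELEC": 1800},
    {"SERIAL": 3, "PERNUM": 1, "HHWT": 50, "COSTELEC": 9990},
    {"SERIAL": 4, "PERNUM": 1, "HHWT": 70, "COSTELEC": 9997},
]

for label, total in weighted_bracket_counts(rows).items():
    print(f"{label}: {total}")
```

In the toy data, the household truncated at 9,990 still lands in the "$250 or more" bracket, which is why the choice between the source variable and COSTELEC/12 does not matter for this particular table.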
Aside from the documentation in the editing procedure tab for COSTELEC, I am not aware of any other adjustments such as those for seasonality. The Census Bureau provides a within-year inflation factor that researchers may use, but the decision to use this factor is left to the discretion of each researcher.
I have a related question.
I am trying to calculate the monthly gross rent variable for my project (vacant rental units). But in some cases, especially where COSTELEC and COSTGAS are top coded, my calculated gross rent differs from IPUMS’ RENTGRS.
Aren’t we supposed to divide by 12 to get the monthly cost? If I divide 9990 by 12, I get $832.50. But I am finding that (in Tennessee) $3,100 is added for the monthly electricity cost, which is the top-code value in the original PUMS data for TN.
Am I doing something wrong?
Thank you
You are correct that IPUMS USA currently imposes a maximum value of $9,990 for COSTELEC. This limit carries forward assumptions about the maximum possible value for this variable based on previous samples. Unfortunately, it means that any annualized cost exceeding this threshold is truncated and reported as $9,990 in COSTELEC. In the example you provided, the original top-coded monthly cost is $3,100; multiplied by 12, the annual value ($37,200) exceeds the currently allowed maximum and is instead reported as $9,990. My IPUMS USA colleagues are aware of the issue and are working to modify this maximum value to avoid censoring these types of high values.
In the meantime, you can obtain the original monthly electricity costs by adding the source variables for COSTELEC to your data extract. These provide the reported values with the PUMS top codes, before they are multiplied by 12 or cut off at $9,990. Since each sample has a different corresponding source variable, you will want to first add all of your samples of interest to your data cart. The source variables tab will then display the variable for each of your samples so that you can more easily add them to your data cart. Note that the source variables use the placeholder value “BBBB” to denote variable-specific categories, including cases that are not-in-universe.
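As a rough illustration of why the source variable and COSTELEC/12 disagree for top-coded cases, here is a small Python sketch using the Tennessee figures from this thread. The helper names are hypothetical, and it assumes the source variable is read as text so that the "BBBB" placeholder can be told apart from dollar amounts; your extract format may differ.

```python
# Current fixed ceiling for COSTELEC, as described in this thread.
IPUMS_MAX_ANNUAL = 9990


def parse_source_monthly(raw):
    """Return the monthly cost as an int, or None for the 'BBBB' placeholder.

    Hypothetical helper: assumes the source variable arrives as text.
    """
    text = str(raw).strip()
    if text == "BBBB":
        return None
    return int(text)


def monthly_from_costelec(costelec):
    """Back out a monthly figure from the annualized COSTELEC value."""
    return costelec / 12


# Tennessee example: the PUMS top-code value is $3,100 per month.
source_monthly = parse_source_monthly("3100")
annualized = min(source_monthly * 12, IPUMS_MAX_ANNUAL)

print(source_monthly)                      # 3100
print(annualized)                          # 9990
print(monthly_from_costelec(annualized))   # 832.5
print(parse_source_monthly("BBBB"))        # None
```

The gap between 3,100 and 832.5 is the discrepancy that shows up when a gross rent figure is rebuilt from COSTELEC/12 instead of from the monthly source value.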
More broadly, I want to make sure that you are aware that vacant rental units do not have data for rental or utility costs in the ACS. While you can request a household records only extract from the Extract Request page (right before submitting the extract for processing) that includes these types of vacant units, you will find that all households with VACANCY = 4 (“For seasonal, recreational or other occasional use”) are not-in-universe for these variables. You might take a look at the American Housing Survey for data on these types of vacant units.