Reproducing SPMFTOTVAL

As part of my research, I need to decompose an SPM unit’s cash income (SPMFTOTVAL) into its component parts. I understand that SPMFTOTVAL is simply the sum of INCTOT across the members of an SPM unit, and therefore that SPMFTOTVAL is also the sum of INCTOT’s component incomes, which are listed for various years here. In theory, I should be able to reproduce SPMFTOTVAL by summing, within each SPM unit (identifiable using SPMFAMUNIT), the income components that make up INCTOT in a given year.

In practice, I am able to do this for most years. However, I am having trouble reproducing SPMFTOTVAL using the 2018 and 2019 CPS ASEC. I made sure to remove NIU values from all the component income variables as suggested previously in a similar inquiry here.

In Stata, my code for aggregating these incomes in 2019 at the SPM unit level is as follows:

egen row_income = rowtotal(incwage incbus incfarm incss incwelfr incretir incssi ///
    incint incunemp incwkcom incvet incsurv incdisab incdivid incrent ///
    inceduc incchild incasist incother incrann incpens)

bysort year spmfamunit: egen spm_cash_cps = total(row_income)
// Reconstructed SPMFTOTVAL = spm_cash_cps

For 2018, my code differs slightly from the above because I don’t include INCRANN or INCPENS in the sum.

Comparing means for SPMFTOTVAL and my reconstructed variable SPM_CASH_CPS above for years 2018 and 2019, I find the means of each variable differ by several hundred thousand.

Taking just a couple examples from the 2019 data:

  • The family unit with SPMFAMUNIT = 25834001 has SPMFTOTVAL = 133212, but the code above yields a sum of 266424. This is odd because my estimate is exactly double the SPMFTOTVAL.

  • The family unit with SPMFAMUNIT = 25824001 has SPMFTOTVAL = 158576, but the code above yields a sum of 148576, which is exactly 10000 less.

For the other data years, 2010 - 2017, I don’t have any problems recreating SPMFTOTVAL using the procedure detailed above. I’m curious whether I am missing something for 2018 and 2019, or whether I am going about things the wrong way.

Thanks for your inquiry. We are looking into this, but we likely won’t have a response until sometime next week.

I was not able to replicate your results. When I rebuilt SPMFTOTVAL manually by summing the component variables (after replacing NIU codes with 0), I got a nearly perfect match; the only exceptions were a handful of cases that differ by $1, which I assume are rounding errors. In the process, however, I did notice that the NIU codes for the component variables were cumbersome to track down. My best guess is that your code reassigning NIUs had a minor typo (e.g., a 9999999 instead of a 999999). I have verified that the NIU codes are listed correctly on each variable’s webpage and that each variable’s code is consistent across these years, but the codes differ from one INC* variable to another.

Since I already tracked them down, here are the current NIU codes (in Stata syntax) that need to be handled for these variables (for variables available in 2018-2019):

recode incchild inceduc incss incssi incunemp incwelfr incwkcom (999999=0)
recode incasist incdisab incint incrent incsurv incvet incdivid incother (9999999=0)
recode incwage inclongj incbus incfarm incretir (99999999=0)
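
To check whether the recodes above resolve the mismatch, something like the following sketch may help. It assumes a 2018-2019 IPUMS CPS ASEC extract is in memory and that the three recode statements have already been run. The names spm_cash_check and diff are made up for illustration, and the component list is the 2018 list from the original post (without INCRANN and INCPENS, whose NIU codes are not given in this thread). The duplicates check at the end assumes YEAR, SERIAL, and PERNUM are in the extract and uniquely identify a person; an exact doubling would be consistent with repeated person records, though nothing in this thread confirms that cause.

```stata
* Sketch only: run after the three recode statements above.
* Component list is the 2018 specification from the original post.
local comps incwage incbus incfarm incss incwelfr incretir incssi ///
    incint incunemp incwkcom incvet incsurv incdisab incdivid incrent ///
    inceduc incchild incasist incother

* A maximum still equal to 999999, 9999999, or 99999999 suggests that
* an NIU code was missed for that variable.
foreach v of local comps {
    quietly summarize `v'
    display "`v'" _col(12) "max = " %12.0f r(max)
}

* Person-level sum, stored as double so large totals are not rounded.
egen double row_income = rowtotal(`comps')

* SPM-unit total of the person-level sums.
bysort year spmfamunit: egen double spm_cash_check = total(row_income)

* Count 2018 records whose rebuilt total is off by more than $1.
generate double diff = spm_cash_check - spmftotval
count if year == 2018 & abs(diff) > 1 & !missing(spmftotval)

* Assumption: year, serial, and pernum identify one person record.
* Any surplus copies reported here would inflate the unit totals.
duplicates report year serial pernum
```

For 2019, the same check would need INCRANN and INCPENS added to the list after their NIU codes have been looked up on the variable pages and recoded to 0.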