Why is household income level in 1970 so high?

Hello all,

I am trying to track real household income (built by aggregating INCWAGE) from 1960 to 2010.
However, the mean household income in 1970 is extremely high (69612.63) compared with 1980 (44789.66), 1990 (48879.25), 2000 (54244.7), and 2010 (51436.05).

Here is what I did in Stata:

  1. drop observations with gq>2;
  2. set INCWAGE values of 999998 and 999999 to missing;
  3. generate household income by summing INCWAGE within SERIAL;
  4. convert household income to real terms using CPI99;
  5. drop households with zero income and keep only one observation per household;
  6. compute the weighted average of real household income.
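
* Step 1: drop observations with gq>2
drop if gq>2
* Step 2: recode the INCWAGE codes 999998 and 999999 to missing
replace incwage=. if incwage>=999998
* Step 3: sum wage income over all persons sharing a SERIAL
bysort serial: egen household_income=sum(incwage)
* Step 4: convert to real terms with CPI99
gen real_hhinc=household_income*cpi99
* Step 5: drop zero-income households, then keep one record per household
drop if real_hhinc<=0
keep if pernum==1
* Step 6: weighted mean using the household weight
sum real_hhinc [aw=hhwt]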

I am relatively new to the data set so I am not sure if I did something wrong or missed something.
Thank you!

I believe that @JeffBloem has already replied directly to you, but I’ll share his response here as well:

It looks like you might be using two of the 1970 samples, when (with the IPUMS-provided sampling weights) you should really only be using one at a time. If this is the case, your estimate for 1970 is twice as large as it should be. You can either use just one 1970 sample, or use both 1970 samples and divide the 1970 sampling weight by 2. In either case, this should give you estimates that make much more sense in your time series.
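
As a rough illustration of the two options, here is a minimal Stata sketch. It assumes the extract includes the IPUMS variables year, sample, and hhwt; the name hhwt_adj is made up for this example, and the sample code in the commented-out line is a placeholder to be filled in from the tabulation, not a real value.

* Sketch only, under the assumptions stated above.
* Check how many 1970 samples are in the extract; more than one
* distinct value of sample means several 1970 samples were pooled.
tabulate sample if year == 1970

* Option 1: keep a single 1970 sample. Replace the placeholder with
* one of the codes listed by the tabulation before uncommenting.
* keep if year != 1970 | sample == <one 1970 sample code>

* Option 2: keep both 1970 samples and halve the 1970 household weight.
* hhwt_adj is a hypothetical new variable, so hhwt itself stays untouched.
gen double hhwt_adj = hhwt
replace hhwt_adj = hhwt / 2 if year == 1970

With option 2, the adjusted weight would then take the place of hhwt in the weighted summary. Dividing by 2 is only appropriate if exactly two 1970 samples are present, which the tabulation should confirm.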

Let us know if you have any other questions!

Thank you Michelle for sharing Jeff’s response!
I got his email today. Thank you for your help ^^