Analyzing Adjusted Household Income in ACS 2023

Hi,

I am trying to calculate adjusted (equivalized) household income in the ACS 2023 1-year sample, using the OECD-modified equivalence scale. However, the results I obtained are unrealistically high, which suggests that something is wrong with my calculation. For example, the adjusted annual household income (in 2023 U.S. dollars) for males aged 65 and over is $329,875, which is clearly incorrect.

My Stata code is below:

sort serial age
by serial (age), sort: gen adult_order = .
by serial (age): replace adult_order = sum(age >= 14)
gen weight = .
replace weight = 0.3 if age < 14
replace weight = 1 if age >= 14 & adult_order == 1
replace weight = 0.5 if age >= 14 & adult_order > 1
bysort serial: egen adjusted_hh_size = total(weight)
generate adjhhincome=HHINCOME/adjusted_hh_size

I use person weights, but even when I try household weights, the adjusted household incomes are still extremely high. I would appreciate it if you could help me identify what mistake I am making here.

While our user support team does not provide code review, in my experience a common mistake with income variables is forgetting to account for specialty codes. The codes tab in the documentation for HHINCOME lists the following code:

9999999 = N/A.

Not-applicable (N/A) specialty codes are assigned to persons who are not in the universe for the corresponding variable. The definition of who is in the universe for a particular variable is provided in that variable’s universe tab. In the case of HHINCOME, persons in group quarters (GQ = 3 or 4) as well as vacant units (GQ = 0) are outside the universe definition (since there is no household to calculate household income for) and are therefore assigned this specialty code. Note that the majority of variables include this type of specialty coding. These codes are retained in the data so that researchers can decide for themselves how to handle these cases in their analysis. You might consider dropping cases with HHINCOME = 9999999 before running your analysis, with the caveat that your estimates will then exclude persons living in group quarters.
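As a minimal sketch of that suggestion, the Stata lines below build a tiny made-up dataset (the values are illustrative, not ACS records), count how many records carry the N/A code, and recode it to missing instead of deleting the rows. The lowercase variable names (hhincome, gq) and the new variable hhincome_clean are assumptions for illustration; adjust them to match your extract.

```stata
* Toy data for illustration only; not real ACS records
clear
input long serial byte gq long hhincome
1 1 52000
1 1 52000
2 1 -3000
3 3 9999999
4 4 9999999
end

* How many person records carry the N/A code, and in which GQ categories?
count if hhincome == 9999999
tabulate gq if hhincome == 9999999

* Recode N/A to missing so the rows stay in the file but drop out of means
generate double hhincome_clean = hhincome
replace hhincome_clean = . if hhincome == 9999999

* Compare the raw and cleaned variables
summarize hhincome hhincome_clean
```

Recoding to missing rather than dropping keeps the group quarters records available for other variables, while Stata's estimation and summary commands skip the missing values automatically.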

Dear Strahof,

Thank you so much for your response! Yes, I realized that my mistake was that I had not dropped the missing (N/A) values, as you mentioned, or the values smaller than 0.
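For readers who land on this thread, a sketch of the calculation with both exclusions applied might look like the following. It uses a small made-up dataset rather than ACS records, and the names hhinc, n14plus, eq_weight, and eq_size are hypothetical. Whether to exclude negative incomes is an analytic choice, not a requirement of the data. The household weights follow the scale in the original post: 1.0 for the first person aged 14 or over, 0.5 for each additional person aged 14 or over, and 0.3 for each child under 14.

```stata
* Toy person-level data for illustration only; not real ACS records
clear
input long serial byte age long hhincome
1 40 84000
1 38 84000
1 10 84000
1 6 84000
2 70 -2000
3 67 9999999
end

* Treat the N/A code and negative incomes as missing (an analytic choice)
generate double hhinc = hhincome
replace hhinc = . if hhincome == 9999999 | hhincome < 0

* Running count of persons aged 14+ within each household, youngest first
bysort serial (age): generate n14plus = sum(age >= 14)

* Equivalence weight per person, then equivalent household size
generate double eq_weight = cond(age < 14, 0.3, cond(n14plus == 1, 1, 0.5))
bysort serial: egen double eq_size = total(eq_weight)

* Equivalized income; missing wherever hhinc was set to missing
generate double adjhhincome = hhinc / eq_size
list serial age hhinc eq_size adjhhincome, sepby(serial)
```

In this toy example, household 1 (two adults and two children under 14) has an equivalent size of 1 + 0.5 + 0.3 + 0.3 = 2.1, so its adjusted income is 84,000 / 2.1 = 40,000. Households 2 and 3 end up with missing adjusted income and drop out of any subsequent means.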