# HIU, family income, and household characteristics

Dear IPUMS staff,

I’m sorry, this may be a long question. I’m looking at the health insurance unit (HIU), which separates people living in the same house into smaller families based on family relationships. The HIU is often used as a more accurate way to define “family size” and thus eligibility for government-subsidized insurance programs. I’m looking at the distribution of some household characteristics by family income, and I find some weird outliers for a specific income group. I’m curious why that is.

I’m using ACS 2014-2016 data. Here is the Stata code I used to check the distribution; I think you can run it directly:

// I first define the poverty level for each family based on the family size //

gen FPL = hiufpgbase + hiufpginc * (hiunpers - 1)

// calculate family income by summing the income of all individuals in the HIU //

replace inctot = . if inctot == 9999999

bysort year hiuid: egen HIU_inc = total(inctot), missing

drop if HIU_inc < 0

// calculate the family income relative to FPL //

gen ratio = HIU_inc / FPL // so ratio = 1 means family income is at 100% FPL

// Limit the sample to people aged 27-64 with family income below 200% FPL //

keep if age <= 64 & age >= 27

keep if ratio < 2

/* draw the graphs to see the distribution. I use the command “cmogram”, which groups people into equal-sized income bins and plots the average outcome in each bin. I checked some household characteristics such as age, sex, and employment status */

gen male = (sex == 1)

gen employed = (empstat == 1)

cmogram age ratio, cut (1) histopts(bin(40)) scatter line(1) qfitci

cmogram male ratio, cut (1) histopts(bin(40)) scatter line(1) qfitci

cmogram employed ratio, cut (1) histopts(bin(40)) scatter line(1) qfitci

It seems that something is wrong for people with family income around 70%-75% FPL. They are much older on average, and the share employed is much smaller, but the share male looks fine. I also checked many other outcomes, for example marital status and having a college degree, and found a similar pattern. Do you have any idea what is happening with this group of people? Since the sample size is very big, I don’t think there should be outliers like these.

Thanks a lot!

What you have observed does appear to be odd. After digging a bit more into the data you are using, it looks like this clustering around 70-75% of FPL could be related to Social Security Disability Insurance (SSDI). Under SSDI, you must earn less than a certain amount per month. Since older people are much more likely to be disabled, this may be why you see a cluster of older individuals. A few patterns in the data might help explain what you observe: HIUNPERS is, on average, smaller in this range than outside it; individuals with a disability are more likely to fall in this range than those without one; and the mean INCTOT of those in this range who received any Social Security income is slightly lower than that of those who did not.

I can’t say for certain what is happening here, but perhaps this information will provide some clues. You may wish to explore the data deeper using our available income and disability variables.
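As a starting point, the comparisons described above could be sketched roughly as follows. This is only an illustration, not the exact code behind those observations: it assumes your extract also includes INCSS, INCSUPP, and DIFFPHYS (they are not in the code you posted), that `ratio`, `hiunpers`, and `employed` exist as you defined them, and that the N/A codes noted in the comments match your extract, so please check the codebook before relying on it. The variable names `band`, `any_ss`, `any_ssi`, and `disabled` are made up for this example.

```stata
* Sketch only: assumes INCSS, INCSUPP, and DIFFPHYS are in the extract and that
* ratio, hiunpers, and employed were created as in the question.

* Flag the income band where the outliers appear (70-75% of FPL)
gen byte band = inrange(ratio, 0.70, 0.75) if !missing(ratio)

* Recode N/A values to missing before summarizing
* (assumed N/A code: 99999 for both variables; verify in the codebook)
replace incss = . if incss == 99999
replace incsupp = . if incsupp == 99999

* Indicators for receiving any Social Security or SSI income
gen byte any_ss = incss > 0 if !missing(incss)
gen byte any_ssi = incsupp > 0 if !missing(incsupp)

* Indicator for a physical difficulty
* (assumed coding: 0 = N/A, 1 = no, 2 = yes)
gen byte disabled = diffphys == 2 if inlist(diffphys, 1, 2)

* Compare unit size, disability, benefit receipt, age, and employment
* inside vs. outside the band
tabstat hiunpers disabled any_ss any_ssi age employed, by(band) statistics(mean n)

* Mean total income by Social Security receipt, within the band only
tabstat inctot if band == 1, by(any_ss) statistics(mean n)
```

If benefit receipt and disability are much more common inside the band than outside it, that would be consistent with the explanation above, though it would not prove it. These are unweighted means, to match your original code; if PERWT is in your extract, you may want to repeat the comparison with person weights.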