I’m trying to replicate estimates from ACS Table B19126 for CA using the 2014-2018 sample, specifically looking at median family income for male and female householders with no spouse present with own children under 18 years of age. My estimates are outside the table’s margins of error.
I think I have coded the “male householder, no wife present, with own children under 18 years of age” and the “female householder, no husband present, with own children under 18 years of age” groups correctly. My estimates are very close to the estimates provided in Table B11003.
I suspect that my treatment of the income variable is incorrect, and I was hoping you could steer me in the right direction. I’ve included my Stata code below, but I’ll summarize here, too:
1) Use the Census Bureau variables to construct family units within the household.
2) Identify never-married own children under the age of 18 by family and create a count by family.
3) Create a total family income variable using inctot. (I first recode the N/A code to missing, and I only add the income of related individuals.)
4) Adjust for inflation using the adjust variable.
5) Calculate median family income using epctile and household weights.
I’m getting 45,378 for men and 29,596 for women. The ACS Table reports 46,368 ±667 for men and 30,677 ±257 for women.
Any suggestions or advice would be greatly appreciated.
/*This code replicates estimates in ACS Table B19126, but first we want to make
sure we have the correct population and weights. I use B11003. We can’t use
the IPUMS created variables to replicate family estimates.
First, I make a serial number specific to families in the household using the CB’s original serials. */
egen serialfam = concat(cbserial cbsubfam), format("%15.0g")
/* Next I tag never-married children in the family under the age of 18. “Own child”
is defined as biological, step, or adopted. */
gen ownchild = 0
replace ownchild = 1 if (related==301 | related==302 | related==303) & age<18 & marst==6
/* Then get a count of how many own children under the age of 18 are in the family.
This only works for primary families, because relate is a variable that shows the
relationship to the head of household. */
by serialfam, sort: egen n_ownchild = total(ownchild)
sort serialfam pernum
list serialfam cbsubfam pernum relate age sex ownchild n_ownchild in 1/50 //it works!
*Now take a look at the estimates in B11003. They are very close.
total personMIL if n_ownchild>0 & relate==1 & marst!=1 [iw=hhwt], over(sex)
*Create a family income variable. First recode the Not Applicable code (9999999) to missing rather than zero, because the CB includes families with zero income when calculating median family income.
replace inctot=. if inctot==9999999
by serialfam, sort: egen family_inc = total(inctot) if relate<=10
sort serialfam pernum
list serialfam pernum relate age sex inctot family_inc in 1/50
*We have to adjust income variables for inflation.
gen family_incADJ = family_inc*adjust
epctile family_incADJ if n_ownchild>0 & relate==1 & marst!=1 & family_incADJ!=. [iw=hhwt], percentiles(50) over(sex)
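/* Possible cross-check (a sketch only, not something already run above): compute the same
weighted medians with official Stata's _pctile, to rule out epctile as the source
of the gap. Assumptions: sex is coded 1 = male and 2 = female as in IPUMS, and
hhwt is treated as a pweight, which should not change the point estimate. */
forvalues s = 1/2 {
    _pctile family_incADJ if n_ownchild>0 & relate==1 & marst!=1 & sex==`s' & family_incADJ!=. [pweight=hhwt], percentiles(50)
    display "sex = `s': weighted median family income = " r(r1)
}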