Hi, how does one calculate tenure by race and ethnicity? I specifically want to know the ownership rates for non-Hispanic whites, non-Hispanic Blacks, etc., and Latinos. I have this code for St. Louis County (MO), but the rates are slightly different than what’s reported online in other sources:
tab ownershp if year==2022 & statefip==29 & countyfip==189 & pernum==1 & ownershp!=0 [fweight=hhwt]
Below is the screenshot from the Census website of the numbers I’m trying to replicate (it does not split them by race/ethnicity); this is my starting reference point.
Thank you in advance!
When comparing your estimates to official estimates published by the Census Bureau, we generally expect estimates produced with IPUMS microdata to fall within the margins of error of the Census Bureau’s tables. Your estimates and the Census Bureau’s official estimates will not be identical because the Census Bureau produces its estimates using an internal version of the data that is available only to the Bureau. The data IPUMS USA provides is the public use version of the ACS data. The internal version includes additional records and additional detail in some variables; we have a blog post that covers this topic and may be of interest to you.
I will provide some information on how I would approach estimating ownership rates by race in St. Louis County using IPUMS USA data.
You’ve correctly narrowed your household-level analysis to one person record per household using PERNUM==1. To define non-Hispanic whites, non-Hispanic Blacks, Hispanics of all races, or other racial/ethnic groups, you will need to use RACE in conjunction with HISPAN, because Hispanic origin is reported separately from race. I suggest using RACE and HISPAN to define the categories you are interested in within a new race+ethnicity variable (keep in mind that the estimates you are comparing to may or may not use mutually exclusive race/ethnicity categories). Note that you are tabulating a household-level characteristic by a person-level characteristic here (i.e., the homeownership rate by the race of the household reference person).
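As a rough sketch of the race+ethnicity variable described above, something like the following could work in Stata. The variable name raceeth and its four categories are my own illustration, and the numeric codes are the IPUMS USA general codes as I understand them (HISPAN 0 = not Hispanic, 1-4 = Hispanic origin, 9 = not reported; RACE 1 = White, 2 = Black/African American, 3-9 = other or multiple races), so please check them against the codebook that came with your extract.

```stata
* Illustrative only: raceeth and its categories are hypothetical names.
* Assumed IPUMS USA general codes (verify in your codebook):
*   HISPAN: 0 = not Hispanic, 1-4 = Hispanic origin, 9 = not reported
*   RACE:   1 = White, 2 = Black/African American, 3-9 = other or multiple races
gen byte raceeth = .
replace raceeth = 1 if race == 1 & hispan == 0            // non-Hispanic White
replace raceeth = 2 if race == 2 & hispan == 0            // non-Hispanic Black
replace raceeth = 3 if inrange(hispan, 1, 4)              // Hispanic, any race
replace raceeth = 4 if inrange(race, 3, 9) & hispan == 0  // non-Hispanic, other or multiple races

label define raceeth_lbl 1 "Non-Hispanic White" 2 "Non-Hispanic Black" ///
    3 "Hispanic, any race" 4 "Non-Hispanic, other/multiple races"
label values raceeth raceeth_lbl

* Unweighted check that every reference person landed in a category;
* records with HISPAN not reported stay missing in this sketch.
tab raceeth if pernum == 1, missing
```

These categories are mutually exclusive because Hispanic origin takes priority over race. If the source you are comparing against counts Hispanic householders within each race group as well, the definitions would need to change accordingly.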
The appropriate weights to use are pweights, rather than fweights. In Stata, fweights are frequency weights, which indicate the number of duplicated observations. There are no duplicated observations of individuals or households within a given sample of the U.S. decennial census or American Community Survey. Probability weights (pweights in Stata) are sampling weights that adjust for the probability that an observation is included in the sample. To estimate empirically derived standard errors (or confidence intervals), you should additionally use replicate weights by adding REPWT to your extract.
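Here is a hedged sketch of how the weighting advice above might look in Stata. One practical wrinkle is that tabulate does not accept pweights, so this uses proportion and svy: tabulate instead. It assumes a race+ethnicity variable built from RACE and HISPAN (hypothetically named raceeth here), that REPWT was added to the extract so that repwt1-repwt80 are present, and that successive difference replication, vce(sdr), is the right variance method for these ACS replicate weights. Please confirm that last point against the IPUMS USA replicate weight documentation before relying on the standard errors.

```stata
* Sketch only: raceeth is a hypothetical variable built from RACE and HISPAN.
* Flag reference persons in St. Louis County, MO, in the 2022 ACS,
* excluding records where OWNERSHP is 0 (N/A).
gen byte insample = year == 2022 & statefip == 29 & countyfip == 189 ///
    & pernum == 1 & ownershp != 0

* Point estimates of tenure shares by group, using probability weights.
proportion ownershp if insample [pweight = hhwt], over(raceeth)

* Replicate-weight standard errors and confidence intervals.
* Assumes repwt1-repwt80 exist in the data and that SDR is appropriate.
svyset [pweight = hhwt], vce(sdr) sdrweight(repwt1-repwt80) mse
svy, subpop(insample): tabulate raceeth ownershp, row ci
```

The row percentages from svy: tabulate give the share of owner and renter households within each group, and the point estimates should match what you would get from the same tabulation with fweights, since only the standard errors depend on the weight type.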