I am trying to estimate the number of Native Hawaiians (NH), including part-NH, in Hawaii based on the 2010-2014 ACS 5-year sample. I coded NH from the detailed codes of the RACE variable to include NH alone and NH in combination with other race groups, and restricted the sample to Hawaii. My weighted count of NH including part-NH is 400,662, which is much higher than the estimate of 295,409 for “Native Hawaiian alone or in any combination” from the Census table (http://factfinder.census.gov/bkmk/table/1.0/en/ACS/14_1YR/S0201/0400000US15/popgroup~062).
Here is my code to create the NH estimate:
replace NH=1 if raced==630 | raced==821 | raced==861 | raced==862 | ///
    raced==863 | raced==864 | raced==911 | raced==912 | ///
    raced==913 | raced==914 | raced==964
tab NH [fw=perwt]
I notice that some categories of “raced”, such as “862 Chinese, Filipino, and Native Hawaiian (2000 1%)”, carry additional information (“2000 1%”) in parentheses. The Comparability section for the RACE variable mentions something about the 1% and 5% samples (https://usa.ipums.org/usa-action/vari…), but I am not sure what that means for the estimate I want to get. To check whether these categories explain the difference between my estimate and the number from the Census tables, I also excluded the race categories that have additional information in parentheses. That is, I coded as follows instead:
replace NH1=1 if raced==630 | raced==821 | raced==861 | ///
    raced==864 | raced==911 | raced==914
tab NH1 [fw=perwt]
The returned estimate is 342,838, which is still higher than the figure in the Census tables.
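A breakdown by detailed code might help show which categories account for the gap. This is only a sketch: it assumes the extract includes the IPUMS variable statefip (15 = Hawaii) alongside raced and perwt, and that NH has already been created as above.

```stata
* Sketch only: statefip is assumed to be in the extract (15 = Hawaii)
* Weighted count of each detailed race code among cases flagged NH==1
tab raced if NH==1 & statefip==15 [fw=perwt]
```

The total of this table should match the NH==1 row of the earlier tab if the sample is already restricted to Hawaii.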
I am not sure what is wrong and would appreciate any help on this!