Weighted household count

I am trying to calculate the weighted frequency of unique households within the levels of a categorical variable. I know I need to use SERIAL for distinguishing unique household IDs. I have this code in SAS, but I can’t figure out how to incorporate HHWT.

proc sql;
select race, count(SERIAL) as unique_count
from (select distinct race, SERIAL from dataset)
group by race;
run;

but Does anyone have tips or code for either R or SAS?

It’s not totally clear to me what you’re hoping to accomplish. Do you want the estimated total number of households headed by Black individuals (for example) in the country? If that’s the case, try this modification:

proc sql;
select race, sum(HHWT) as unique_count
from dataset
where pernum=1
group by race;
run;

The PERNUM=1 keeps only one record per household, and it’s the head of household. Once you’ve restricted the dataset in such a way, you don’t need to worry about SERIAL. Finally to get the total estimate, you sum the household weights (HHWT).

Thank you! I should have used my actual variables and explained them rather than using race as a place holder. So, I constructed a variable called “doubled up” that describes an individual living in an extended/overcrowded (complex definition based on relationship to household head and more) household. Rather than only quantifying the number of doubled up individuals, I want to quantify the number of households that contain ANY doubled up individual, hence the use of SERIAL.

Here’s a way I’m thinking about it, combining what you sent with what I had originally. But, this doesn’t run.

proc sql;
select doubledup, sum(HHWT) as unique_count
from (select distinct doubledup, SERIAL from tmp2.final2019jan10)
group by doubledup;
run;

Any additional advice now that I’ve explained it better? Thank you!

That code looks ok to me. It should give the correct # of households with doubled-up individuals. But the other category will count the number of households with any non-doubled-up individuals, so there may be double-counting of households there, so just be aware of that.

I think the problem is that the table in your FROM statement needs to include HHWT.