How to identify same sex couples


How do I identify same-sex couples (married or not)? If possible, please write the exact Stata or R code so that it is 100% clear.

Thank you & Best regards,


I realize that my question was a bit too brief:

I am talking about the American Community Survey for 2000+.

I used two methods to identify same-sex couples:

  1. I used SPLOC: if the two individuals linked by SPLOC have the same sex, they are categorized as homosexual.

The Stata code for this looks like this:

use ACS, clear

* SPLOC == 0 means no partner
drop if sploc == 0

rename sex sex_p

rename sploc X

* keep only the merge keys and the partner's sex
keep year serial X sex_p

save tobemerged, replace

use ACS, clear

gen X = pernum

merge 1:1 year serial X using tobemerged

The result is a dataset in which each individual with a partner has a variable holding the sex of their partner (i.e. SEX_P).

The next step is simply:

gen homosexual =0

replace homosexual =1 if sex == sex_p

  2. The second method is similar. I used the variable RELATED to identify same-sex couples.

It is a bit more complicated, but the idea is that RELATED shows each person's relationship to the head of household (who has RELATED = 101). If an individual in the same household is related to the head as a spouse (= 201) or unmarried partner (= 1114) and both have the same sex, they are identified as homosexual.
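A minimal Stata sketch of this second method, under some assumptions: the toy rows below are made up and merely stand in for an ACS extract, variable names are lowercase as in an IPUMS Stata extract, and head_sex, partner_same, hh_same and ssc are hypothetical helper names. It also assumes at most one spouse or unmarried partner of the head per household.

```stata
* Toy data standing in for an ACS extract (values are made up for illustration)
clear
input year serial pernum sex related
2010 1 1 1 101
2010 1 2 1 201
2010 1 3 2 301
2010 2 1 1 101
2010 2 2 2 201
2010 3 1 2 101
2010 3 2 2 1114
end

* Sex of the household head, copied to every member of the household
bysort year serial: egen head_sex = max(cond(related == 101, sex, .))

* Spouse or unmarried partner of the head who has the same sex as the head
gen byte partner_same = inlist(related, 201, 1114) & sex == head_sex

* Does the household contain such a partner?
bysort year serial: egen byte hh_same = max(partner_same)

* Flag both members of the couple: the partner and the head
gen byte ssc = partner_same | (related == 101 & hh_same == 1)

list, sepby(serial)
```

On this toy data, both persons in households 1 and 3 are flagged, while the child in household 1 and both persons in household 2 are not.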

Both methods give the same result; that is, they identify exactly the same people as homosexual. However, the variable SSMC, which identifies same-sex married couples, is not in line with this identification.

I am aware that SSMC is a household-level variable; that is, everyone living in the household of a married homosexual couple gets identified as homosexual. To bring SSMC to the individual level I simply recode it (I also drop all cases where the allocation is a bit shaky, i.e. SSMC = 1):

drop if SSMC ==1

recode SSMC (2=1)

replace SSMC = 0 if married > 2

If I now cross-tabulate the dummy for being in a homosexual couple against the dummy for being in a same-sex married couple, I find 27,264 individuals who are in a homosexual marriage but are not identified as homosexual by my two methods. Where is my mistake?

Thank you in advance and best regards,


I think the issue is that you are trying to match frequencies from two variables that are not constructed using the same methodology and/or are not available for the same samples. See my notes here on the differences between the two.

Here is an example of Stata code, which relies on examining relationships to the head of the household only, that you could use to determine same-sex couples using SPLOC and RELATE (you will need to attach spouse characteristics for SEX – sex_sp and RELATED – related_sp when creating your extract):

keep if pernum == 1
keep if sex == sex_sp
keep if related_sp == 0201 | related == 1114

Because SPLOC and SSMC are created in different ways, considering just same-sex married couples, you will end up with two different values depending on your approach. Which approach you use is ultimately up to you and will likely depend on the scope of your analysis.

I hope this helps.

Dear Michelle,

First, thank you for your responses and for being so helpful! You have already helped me a lot.

I have to ask a few additional questions, just to ensure that I have a correct dependent variable and don’t publish something wrong. My questions relate to your answers in either this thread here or this thread here:

You wrote:

“Because SSMC is a household level variable, keep in mind that you will want to remove other members of the household that are not spouses.”

The spouse would always be the spouse of the head of household, right? So I would want to keep the spouses AND the heads of household, right? In other words, if the household is a same-sex married couple household, the spouse and the head of household are together in a same-sex marriage?

You write that to identify same sex couples I should use this code:

“keep if pernum == 1
keep if sex == sex_sp
keep if related_sp == 0201 | related == 1114”

I have the same issue as with the first question: why wouldn’t I also keep the head of household? (RELATED shows the relationship to the head of household, so if one person in the household is a spouse, that automatically makes the head of household part of a couple as well.) If you do “keep if related_sp == 0201 | related_sp == 1114” (I assume you meant to write related_sp == 1114), you lose all the heads of household because their related_sp == 0101.

So overall, I would think I should do it like this:

*First I recode the RELATED variable so that the combination of the two partners' values is unique for couples.

recode related (101 = 1) (201 = 5) (1114 = 10), g(r)

recode related_sp (101 = 1) (201=5)(1114 = 10), g(r_sp)

*Now I combine them. This new variable has two unique values: 6 (head of household and spouse) and 11 (head of household and unmarried partner)

gen rr = r + r_sp

*A person is now part of a same-sex couple if there is a relationship where one is the head of household and the other is related to the head of household as his/her spouse or unmarried partner, and both have the same sex.

gen ssc = 0

replace ssc = 1 if sex == sex_sp & rr <= 11

Now the same sex couple variable (ssc) includes the head of household and the spouse/partner.

When I construct my SSC variable like this and recode SSMC so that only heads of household and their spouses have the value 1, all cases with SSMC = 1 also have SSC = 1, which is nice. It finally all seems to fit, thanks also to your help =).

In summary, it appears to me that I have solved the issue, but to make sure that I am not overlooking something, here is my question in short:

  1. For both variables, SSMC and SSC: should the head of household also be coded as 1, since they are part of the partnership?

Thank you again for all your help, and please let me know if I am overlooking something here.

Best regards,


I’m a bit confused about what your research question is and the parameters of your research; however, I will do my best to address what you have laid out here in different pieces.

Starting with SSMC:

The reasoning behind the statement “because SSMC is a household level variable, keep in mind that you will want to remove other members of the household that are not spouses” is that the value of SSMC is assigned to ALL members of a household. For example, let’s say SSMC == 2. A child living in this household will also take the value SSMC == 2. If you are interested in same-sex married couples, then yes, you would want to have information for both the head of the household and the spouse. This could be accomplished by keeping data for both, as you have described. However, if you are using the attach characteristics feature, there would be no need to keep both. How you structure your data depends on your analysis, so you will have to decide which is best for you.
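As a rough illustration of the household-level point, here is a small Stata sketch. The rows are made up, the variable names are lowercase as in an IPUMS Stata extract, and ssmc_person is a hypothetical name; it assumes, as in the example above, that SSMC == 2 marks a same-sex married couple household.

```stata
* Toy data (made up): household 1 has ssmc == 2 on every member, including the child
clear
input year serial pernum related ssmc
2010 1 1 101 2
2010 1 2 201 2
2010 1 3 301 2
2010 2 1 101 0
2010 2 2 201 0
end

* Person-level flag: 1 only for the head (101) and the spouse (201)
* in a household with ssmc == 2; 0 for everyone else
gen byte ssmc_person = ssmc == 2 & inlist(related, 101, 201)

tabulate related ssmc_person
```

The child in household 1 carries ssmc == 2 but gets ssmc_person == 0, which is the "remove other members" step expressed as a flag rather than a drop.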

Regarding your constructed SSC variable:

The methodology you have outlined will yield the same result as the following (which I proposed in a previous answer):

keep if pernum == 1
keep if sex == sex_sp
keep if related_sp == 0201 | related_sp == 1114

In both cases, you will have same-sex couples in which one person is always the head of the household. The difference between the two is just that I use the attach characteristics feature, while your methodology relies on individual observations. In essence, my approach yields half the number of rows that yours will, because spouse/unmarried partner characteristics are in the same row as the head of the household.
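The row-count relationship can be checked on a toy example. This is only a sketch: the rows are made up, sex_sp and related_sp stand in for attached spouse characteristics (assumed here to be filled in for unmarried partners too), and head_row, partner_row and ssc are hypothetical names.

```stata
* Toy data (made up): three two-person households with attached partner values
clear
input serial pernum sex related sex_sp related_sp
1 1 1 101 1 201
1 2 1 201 1 101
2 1 2 101 2 1114
2 2 2 1114 2 101
3 1 1 101 2 201
3 2 2 201 1 101
end

* Head-only approach: one row per couple
count if pernum == 1 & sex == sex_sp & inlist(related_sp, 201, 1114)
local n_heads = r(N)

* Person-level approach: both partners are flagged
gen byte head_row = related == 101 & inlist(related_sp, 201, 1114)
gen byte partner_row = inlist(related, 201, 1114) & related_sp == 101
gen byte ssc = sex == sex_sp & (head_row | partner_row)
count if ssc == 1
local n_persons = r(N)

display "head rows: `n_heads'   persons flagged: `n_persons'"

* The person-level count should be exactly twice the head-only count
assert `n_persons' == 2 * `n_heads'
```

Here the head-only approach finds two rows (households 1 and 2) and the person-level flag marks four persons, so the assert passes.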

So, if I am understanding your question correctly, yes, by both SSMC and your construction of SSC (which, keep in mind are two different approaches to identifying same-sex couples that rely on different methodologies), you will always have 1 member of each partnership that is the head of the household.

I hope this helps. If there is still some confusion, perhaps you can email along with your full Stata code and a more detailed description of your analysis.

Thank you Michelle,

It is all clear now. I guess it was just that I didn’t know about this attach characteristics method. However, as my method is feasible as well, my problems are solved.

Thank you very much!

Best regards,