I have been trying to count all the households in my IPUMS CPS extract, but for some reason I haven’t been able to get the number to match the total number of households listed on the website here.
Following the survey documentation, I have been identifying households by the combination of serial and year. However, when I count the number of households by this variable combination (bysort serial year: gen tag = _n == 1, then counting the records where tag == 1), I get approximately 20,000 fewer households per year in the 2000s than what is listed on the website. I get the same (lower) number when I simply count the records with pernum == 1.
As an example, in the 2013 extract I get a count of 74,821 households using both of the methods listed above, while the number listed at the link above for 2013 is 98,095. Any suggestions about what I am missing here?
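For reference, here is a minimal sketch of the counts I am running for 2013. It assumes the extract is already loaded in Stata and contains year, serial, and pernum; hh_tag and hh_tag2 are just names I made up for illustration.

```stata
* Method 1: flag the first record within each serial-year combination
bysort year serial: gen byte hh_tag = (_n == 1)
count if hh_tag == 1 & year == 2013

* Same idea using egen's tag() function, which marks one record per group
egen byte hh_tag2 = tag(year serial)
count if hh_tag2 == 1 & year == 2013

* Method 2: count the first person record in each household
count if pernum == 1 & year == 2013
```

All three counts should agree as long as every household in the extract has a record with pernum == 1.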
Thanks!