How do I calculate total housing units for the 2021 ACS?

Hi, I’m trying to find the total housing units in San Francisco using the 2021 1-Year ACS PUMS estimates. I tried using the [UNITSSTR] variable but the numbers are significantly different from what the Census website shows. I’m using Stata. Any help is appreciated.


To be clear, the number I’m getting is the total occupied housing units.

In order to help you understand what is going on, I’ll need you to provide some more information.

1: How exactly did you do the calculation? Did you use weights?

2: What is the number you are trying to replicate, and how far off is your estimate?

3: Where did you get the number from the Census Bureau? (Please include a link.)

Hi Matthew, here are the answers to your questions:

  1. Stata code: tab numprec [fweight = hhwt] if pernum==1 & (gq==1 | gq==2)
    This gives me the total occupied housing units. I’m wondering how I can get total housing units.

  2. The Census website gives me a total of 412,269 housing units. The calculation above gives me 350,796 (which is the total occupied housing units).

  3. Here is the link: Explore Census Data

Thank you for your help!


By default, IPUMS data are rectangularized at the person level. This means that information from the household record is appended to the record of each person residing in the household, so household-level variables have the same value for everyone in the same household. This structure is appropriate for person-level analyses. However, you are running a household-level analysis that needs vacant as well as occupied housing units, and vacant units are omitted from the default rectangular file because they contain no person records. Fortunately, you can request a hierarchical extract if you would like to keep the person records within households, or a household-records-only extract if you do not need any person-level data. I requested a hierarchical file, and a cross-tab of group quarters status by vacancy yields the exact 412,269 estimate in the table you linked (see screenshot below).
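
For reference, here is a rough sketch of that cross-tab in Stata. It is not the exact code behind the screenshot. It assumes a household-records-only extract that is already limited to San Francisco and that includes GQ, VACANCY, and HHWT. The commented-out line for a hierarchical file also assumes that the record-type variable is a string named rectype with "H" marking household records, so check your own extract before relying on it.

```stata
* Sketch only, under the assumptions stated above.
* In a household-records-only extract each row is one housing unit,
* so no pernum filter is needed and vacant units are included.
* hhwt is the household weight, the same one used in the occupied-unit tab.
tab gq vacancy [fweight = hhwt], missing

* In a hierarchical extract, restrict the tab to household records.
* Assumption: rectype is a string variable and "H" marks household records.
* tab gq vacancy [fweight = hhwt] if rectype == "H", missing
```

The grand total of the table is the number to compare with the published housing unit count. Vacant units appear in it because they have a household record even though they have no person records.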

(thanks to @KariWilliams for investigating this)

Thank you! That makes sense.