Unreliable DWELLING and DWSEQ in 1920 complete data?


I’m trying to calculate the average number of households per dwelling by enumeration district, using the complete 1920 dataset. My plan was to use DWELLING and DWSEQ… but most dwellings in the data seem to have DWSEQ values up to 99. (The average DWSEQ max for New York City is 99.95. I’ve sampled data from a few rural regions and found similar numbers.)
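
For reference, here is a minimal sketch of the calculation described above: take the maximum DWSEQ within each dwelling as that dwelling's household count, then average those counts within each enumeration district. It assumes one record per household, held as plain dictionaries, and it assumes the enumeration district is stored under a column named ENUMDIST; the function name and the sample rows are made up for illustration.

```python
from collections import defaultdict

def avg_households_per_dwelling(rows):
    """Average of the per-dwelling DWSEQ maximum, grouped by district.

    rows: iterable of dicts, one per household, with keys
    ENUMDIST (assumed column name), DWELLING and DWSEQ.
    """
    # Highest DWSEQ seen in each (district, dwelling) pair.
    max_seq = {}
    for row in rows:
        key = (row["ENUMDIST"], row["DWELLING"])
        max_seq[key] = max(max_seq.get(key, 0), row["DWSEQ"])

    # Sum and count of those maxima per district.
    totals = defaultdict(lambda: [0, 0])
    for (district, _dwelling), seq in max_seq.items():
        totals[district][0] += seq
        totals[district][1] += 1

    return {d: total / count for d, (total, count) in totals.items()}

# Hypothetical sample rows, not real census data.
sample = [
    {"ENUMDIST": 1, "DWELLING": 10, "DWSEQ": 1},
    {"ENUMDIST": 1, "DWELLING": 10, "DWSEQ": 2},
    {"ENUMDIST": 1, "DWELLING": 11, "DWSEQ": 1},
    {"ENUMDIST": 2, "DWELLING": 20, "DWSEQ": 1},
]
result = avg_households_per_dwelling(sample)
print(result)
```

This only gives sensible numbers if DWSEQ really counts households within a dwelling, which is exactly what looks doubtful in the 1920 file.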

An average of 99 households per dwelling strikes me as fairly unrealistic even in urban settings, especially since the 1900 5% sample gives much lower values for the same areas (for NYC, average DWSEQ max is ~7).

Any chance that DWELLING and DWSEQ haven’t been cleaned in the preliminary release of the 1920 complete dataset? Or am I missing/misreading something?


Followup: Unreliable DWELLING and DWSEQ in 1920 complete data?

Sorry for the delay in responding to this question. It is as you suspected: these variables are not complete for the full-count file. We will be hiding them on the external site until they have been completed, so as to avoid future confusion. Again, sorry for the inconvenience.