How reliable is the ACS before the actual implementation in 2005? The sample sizes are much smaller for 2000-2004. The year 2000 especially stands out: 587,519 addresses were in the final interview, but the data in IPUMS has only 371,618 individuals. Usually there are more individuals in a survey than households interviewed. Did anything happen in 2000? For 2001-2004 the pattern is similar to later years, with more individuals in the data than interviewed households. I assume the sample sizes are smaller for 2001-2004 because it was a testing period?

Apologies for the delayed response on our end!

You are correct that the sample sizes for 2000-2004 are smaller than those for later years because of the testing period. If you take a look at the sample descriptions, you’ll see that those years are random samples ranging from 1-in-750 to 1-in-232, while all later years are 1-in-100 samples. However, I am unsure why the number of addresses in 2000 is so much greater than either the number of individuals or the number of housing units in the IPUMS sample. I have forwarded this question to our USA team and will let you know as soon as I hear back.

The ACS public-use microdata sample (PUMS) contains only the publicly available subset of individuals sampled from the full ACS. The Sample Design section on page 10 of the 2000 PUMS accuracy statement notes that approximately 1 in 4 of all housing units in the ACS were selected for each state through a stratified selection procedure. This is why there are 148,841 households in the PUMS data rather than the 587,519 addresses mentioned in the ACS documentation. In later years this share increases to two-thirds of all cases, as described by this Census FAQ.
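
As a rough check on the figures quoted in this thread, the short Python sketch below compares the PUMS household count with the number of ACS addresses, and the IPUMS person count with the PUMS household count. The variable names are illustrative only, and the numbers are simply the ones cited above, not values pulled from any IPUMS or Census API.

```python
# Figures quoted in this thread; variable names are illustrative only.
acs_addresses_2000 = 587_519    # addresses in the final interview (ACS documentation)
pums_households_2000 = 148_841  # households in the 2000 PUMS data
ipums_persons_2000 = 371_618    # individuals in the IPUMS 2000 ACS sample

# Share of ACS addresses that appear as households in the PUMS file.
pums_share = pums_households_2000 / acs_addresses_2000

# Average number of individuals per PUMS household.
persons_per_household = ipums_persons_2000 / pums_households_2000

print(f"PUMS households / ACS addresses: {pums_share:.3f}")
print(f"Persons per PUMS household: {persons_per_household:.2f}")
```

The first ratio comes out at about 0.25, in line with the roughly 1-in-4 selection described in the accuracy statement. The second is about 2.5 persons per household, so once individuals are compared with PUMS households rather than with all interviewed addresses, 2000 follows the same pattern as the other years.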

