I’m working with the 2016 ACS, but while reading the technical documentation on the Census website I noticed that my sample size differs from what the Census says the PUMS includes. I thought the difference might arise because the sample sizes provided by the Census also include the sample from Puerto Rico, but the discrepancy persists when I look at specific states. For example, the Census document says there should be 24,100 housing units from Minnesota in the sample, but I only have 21,531 with values of 1 or 2 for the variable “GQ”. Similarly, the Census says there should be 140,606 sample units for California, but I only have 132,809. These aren’t huge differences, but I was wondering whether I am doing something wrong or whether there are just slight differences between the PUMS sample described by the Census and the one available here.
Note: I calculated the state housing unit numbers above by counting the observations in which the value for PERNUM was 1 and the value for GQ was 2 or 3, for the appropriate STATEFIP values.
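In case it helps to see the filter spelled out, here is a minimal Python sketch of that kind of count. The sample rows are made up for illustration, the function name `count_housing_units` is hypothetical, and the GQ codes to keep are passed in as a parameter because they are an assumption (1 and 2 are used in the example call). It keeps one record per household (PERNUM equal to 1) and tallies them by STATEFIP.

```python
import csv
import io
from collections import Counter


def count_housing_units(rows, gq_values=(1, 2)):
    """Tally households per state.

    Keeps only the first person record of each household (PERNUM == 1)
    whose GQ code is in gq_values, then counts those records by STATEFIP.
    The default gq_values is an assumption, not a documented definition.
    """
    counts = Counter()
    for row in rows:
        if int(row["PERNUM"]) == 1 and int(row["GQ"]) in gq_values:
            counts[int(row["STATEFIP"])] += 1
    return counts


# Made-up rows standing in for an extract with these four variables.
# STATEFIP 27 is Minnesota and 6 is California.
SAMPLE_CSV = """STATEFIP,SERIAL,PERNUM,GQ
27,1,1,1
27,1,2,1
27,2,1,2
27,3,1,3
6,4,1,1
6,5,1,4
6,5,2,4
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
units = count_housing_units(rows, gq_values=(1, 2))
print(dict(units))
```

Changing `gq_values` is the quickest way to check how sensitive the state totals are to which GQ codes are treated as housing units.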