I’ve run into inconsistencies with the second-stage weights (WTFINL) for April and June 2001. For these two months, the estimates for population calculated using WTFINL are significantly lower than the expected trend and the published numbers.
It appears that the weights based on the 2000 population controls were found to be problematic, giving estimates that were far too low. It was noted in a forum thread that this was due to mismatches in identifiers (here). As a result, on August 6, 2015, these two months were reverted to the original weights, pre-2000 population controls (noted here).
While the original weights don’t lead to as significant a drop in population estimates as the ones based on the 2000 population controls, the drop is still meaningful, as seen in the figure above. In addition, the sample sizes fluctuate quite a bit, with April and June 2001 having significantly more observations than March and May. This was noted in a forum thread from 2016 (here).
I’m following up on this issue: has the problem with the weights based on the 2000 population controls been resolved? If not, does IPUMS have a way of handling the drop in population we continue to see in April and June 2001 when using WTFINL?
You are correct that an issue with WTFINL in the April and June 2001 CPS samples was corrected by IPUMS in 2015. The issue was that the new weights introduced by the Census Bureau, which incorporated population controls based on 2000 census data, produced implausible estimates for the April and June 2001 samples specifically. The issue was isolated to those months. While the Census Bureau has not issued a corrected weight for these samples, IPUMS made the decision to replace the 2000-based weights with weights that use 1990 population controls, for the April and June 2001 samples only. These two samples therefore use a weight that’s slightly different from the weight used in neighboring samples. The original, unharmonized weight variable used in these two samples is UH_ZWGT_2001, and it uses 1990 population controls. Because the population controls derived from the 1990 census differ from those derived from the 2000 census, we expect that population estimates in the April and June 2001 samples will look different from population estimates in neighboring samples, which use weights based on the 2000 population controls.
IPUMS does not have a specific recommendation for how users should treat these data, given that they use different weights. We recommend that users publishing research using these samples note in their work that the weights in April and June of 2001 use different population controls than the weights in the other 2001 samples.
The sample sizes in the April and June 2001 samples are larger due to an oversampling of State Children’s Health Insurance Program (SCHIP) participants in April, June, July, and August of 2001.
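For anyone who wants to see both effects side by side, here is a minimal sketch in Python (pandas) that sums WTFINL and counts observations by month, and flags the two months that use the 1990-based weights. It assumes an extract with YEAR, MONTH, and WTFINL columns; the function name `monthly_summary`, the output column names, and the toy values are made up for illustration and are not real CPS figures.

```python
import pandas as pd


def monthly_summary(df: pd.DataFrame) -> pd.DataFrame:
    """Weighted population estimate and unweighted sample size per month."""
    out = (
        df.groupby(["YEAR", "MONTH"])
        .agg(
            pop_estimate=("WTFINL", "sum"),  # sum of weights = population estimate
            n_obs=("WTFINL", "size"),        # unweighted number of person records
        )
        .reset_index()
    )
    # Flag the two samples that, per the answer above, use 1990-based controls.
    out["uses_1990_controls"] = (out["YEAR"] == 2001) & out["MONTH"].isin([4, 6])
    return out


# Toy data with invented weights (NOT real CPS values), just to show the shape.
toy = pd.DataFrame(
    {
        "YEAR": [2001] * 6,
        "MONTH": [3, 3, 4, 4, 4, 5],
        "WTFINL": [1500.0, 1600.0, 900.0, 950.0, 1000.0, 3100.0],
    }
)

summary = monthly_summary(toy)
print(summary)
```

With a real extract you would replace `toy` with your own data frame, for example one loaded with `pd.read_csv`. The flag column makes it easy to footnote or exclude the affected months in a time series.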