I checked the September 2023 CPS for pwsswgt, pwlgwgt, and pwvetwgt. They do not sum to the US population size; instead, they sum to a number about 10^4 times larger. Similarly, the CPS supplement variables PWNRWGT and PWCMPWGT sum to a number about 10^4 times larger than the US population size.

Is this intended? For ratios and regressions it does not matter. However, if I want to estimate the number of unemployed people, it matters significantly. Is there a document specifically indicating that the total weight is not calibrated to the population size?

The weights you are referring to come from the CPS data from the Census Bureau, and are not IPUMS CPS variables. I see in the Census Bureau’s methodology documentation that “all of these variables have an implied four decimals, though the decimal itself is not included. Hence, each weight must be divided by 10,000 when used.”
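To make the quoted instruction concrete, here is a minimal Python sketch of the rescaling. The raw values are made up for illustration and are not taken from any CPS file; `WEIGHT_SCALE` and `scale_weight` are hypothetical names, not part of any Census Bureau or IPUMS tool.

```python
# Implied four decimals: divide each raw weight by 10,000,
# as the quoted Census Bureau documentation instructs.
WEIGHT_SCALE = 10_000


def scale_weight(raw_weight: int) -> float:
    """Convert a raw weight with four implied decimals to its actual value."""
    return raw_weight / WEIGHT_SCALE


# Made-up raw weights, stored as integers with no decimal point.
raw_weights = [15_234_567, 20_000_000, 9_876_543]

scaled_weights = [scale_weight(w) for w in raw_weights]

raw_total = sum(raw_weights)        # about 10^4 times too large
scaled_total = sum(scaled_weights)  # the number of people represented

print(raw_total, scaled_total)
```

Summing the raw values and dividing the total once by 10,000 gives the same result as scaling each record first.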

I noticed a discrepancy in CPS survey weights (pwsswgt, pwlgwgt, pwvetwgt, PWNRWGT, PWCMPWGT) that don’t sum to the US population. CZ_A mentioned a 10^4 times difference. Is this intended? Any documentation on the weights not being calibrated to population size? Looking for clarification on this issue. Thanks.

I think this was done intentionally. My guess is that it allows better precision for the weights, though the overall weighting scale does not matter for ratio estimation. If you care about a particular total estimate, then it matters in practice. There are surveys in which the total weight is calibrated to the total population only up to a constant scaling factor. The document the moderator linked includes the instruction, though it does not explain the reason behind it.

As Isabel mentioned in her previous post, the weights you are referring to come from the CPS data from the Census Bureau, and are not IPUMS CPS variables. In the Census Bureau’s methodology documentation, it states that “all of these variables have an implied four decimals, though the decimal itself is not included. Hence, each weight must be divided by 10,000 when used.” I am guessing this is the source of the discrepancy you are noticing with these variables.

If you are interested in using IPUMS CPS, here is a crosswalk of the names for the weight variables:

Thank you for clarifying the source of the weights and providing insights into the potential discrepancy. The tip about dividing the weights by 10,000 when using Census Bureau data is particularly helpful for those navigating through the variables. Utilizing IPUMS can indeed enhance the consistency and reliability of the data analysis process.

Hi Isabel, even after dividing by 10,000, the weights sum to only around 3.2m. I am not sure what is wrong here and would appreciate guidance. Thank you.

The weights referred to in the original forum post (pwsswgt, pwlgwgt, pwvetwgt, pwnrwgt, and pwcmpwgt) are original Census Bureau variables. These are not IPUMS variables. Upon reviewing the Census Bureau’s documentation relevant to these variables, I see that they “have an implied four decimals, though the decimal itself is not included. Hence, each weight must be divided by 10,000 when used.” I would recommend confirming that your calculation does divide the weights by 10,000. Also, make sure that your data file is intended to be representative of the entire U.S. non-institutionalized population. The weights in a dataset should sum to the number of individuals represented by the dataset. If the population represented by the sample is smaller than the U.S. population, your weights will not sum to the size of the U.S. population. If there appear to be issues with these variables in the Census Bureau’s original data, we would recommend contacting the Census Bureau directly.
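As a rough way to run the two checks suggested above, the following Python sketch computes a weighted total over all records and over a subset. It assumes the records are already loaded as dictionaries; the `unemployed` field, the `weighted_total` helper, and all values are hypothetical and chosen only for illustration.

```python
def weighted_total(records, weight_key, include=lambda r: True, scale=10_000):
    """Sum the raw weights of the selected records, then apply the implied decimals."""
    return sum(r[weight_key] for r in records if include(r)) / scale


# Made-up records: raw weights carry four implied decimals.
records = [
    {"pwsswgt": 30_000_000, "unemployed": False},
    {"pwsswgt": 25_000_000, "unemployed": True},
    {"pwsswgt": 18_500_000, "unemployed": False},
]

# People represented by the whole file. If this is far below the U.S.
# population, the file may cover only a subpopulation.
represented = weighted_total(records, "pwsswgt")

# Estimated number of unemployed people among those represented.
unemployed = weighted_total(records, "pwsswgt", include=lambda r: r["unemployed"])

print(represented, unemployed)
```

If the first total is much smaller than expected after scaling, comparing it with the population the file documents as its universe is a reasonable next step before suspecting the weights themselves.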