Dear folks –
I have downloaded a file that contains most of the IPUMS-CPS variables, 1962-2012. I have managed to bite off the top 100 records, and there are some things I am confused about.
- The RECTYPE for every record is H.
- Whenever NUMPREC is n, there are n consecutive rows with that NUMPREC. This suggests I am looking at person-level records with the household information duplicated into each.
- For those same n rows, the HWTSUPP is always duplicated n times. I hope this means that it has been divided by n before being copied into each person record. Otherwise these weights will overweight larger households by a factor of n, correct?
- Some of the person weights and the household weights seem to be out of alignment by row. For example, the first two households with NUMPREC=2 begin in rows 8 and 10 (at the beginning of 1962); each has its HWTSUPP duplicated in the next consecutive row, and its WTSUPP likewise. But the first household with NUMPREC=3, SERIAL=13, begins in row 17, with its HWTSUPP duplicated in rows 18 and 19, while the first run of three consecutive identical WTSUPP values runs from row 19 to row 21. The first household with NUMPREC=4, SERIAL=29, begins in row 41, again with four consecutive identical HWTSUPP values. However, I was not able to find four consecutive individuals with the same WTSUPP anywhere in the first 150 lines. Am I wrong that any two individuals in the same household must have the same sampling weight as one another, equal to the household weight divided by n?
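To make the checks above concrete, here is a minimal sketch in Python of what I have been testing by eye. The rows are made-up values, not taken from my extract; `household_key` and `check_households` are hypothetical helper names; and identifying a household by YEAR plus SERIAL is my assumption about the file layout.

```python
from itertools import groupby

# Made-up rows for illustration only; these are NOT values from the extract.
rows = [
    {"YEAR": 1962, "SERIAL": 1, "NUMPREC": 2, "HWTSUPP": 1500.0, "WTSUPP": 1500.0},
    {"YEAR": 1962, "SERIAL": 1, "NUMPREC": 2, "HWTSUPP": 1500.0, "WTSUPP": 1500.0},
    {"YEAR": 1962, "SERIAL": 2, "NUMPREC": 3, "HWTSUPP": 1200.0, "WTSUPP": 1200.0},
    {"YEAR": 1962, "SERIAL": 2, "NUMPREC": 3, "HWTSUPP": 1200.0, "WTSUPP": 1300.0},
    {"YEAR": 1962, "SERIAL": 2, "NUMPREC": 3, "HWTSUPP": 1200.0, "WTSUPP": 1300.0},
]


def household_key(row):
    # Assumption: a household is identified by (YEAR, SERIAL).
    return (row["YEAR"], row["SERIAL"])


def check_households(rows):
    """For each run of consecutive rows sharing a household key, report
    whether the run length matches NUMPREC and whether each weight is
    constant across the run."""
    report = []
    for key, group in groupby(rows, key=household_key):
        members = list(group)
        report.append({
            "key": key,
            "size_matches_numprec": len(members) == members[0]["NUMPREC"],
            "hwtsupp_constant": len({m["HWTSUPP"] for m in members}) == 1,
            "wtsupp_constant": len({m["WTSUPP"] for m in members}) == 1,
        })
    return report


report = check_households(rows)
for entry in report:
    print(entry)

# The overweighting concern: summing HWTSUPP over every person row counts
# each household n times, while taking it once per household does not.
sum_over_rows = sum(r["HWTSUPP"] for r in rows)
sum_once = sum(next(g)["HWTSUPP"] for _, g in groupby(rows, key=household_key))
print(sum_over_rows, sum_once)
```

In this toy data the second household has a constant HWTSUPP but differing WTSUPP values, which is the pattern I seem to be seeing in the real file.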
Sincerely, Andrew Hoerner