I found that a number of PUMA codes in the ACS (100, 101, …) are assigned to ALL states. In other words, I have various households (serial) in every U.S. state assigned to the PUMA code 100 (and 101, etc.). This odd duplication of PUMAs affects a significant portion of my data set.
(I am aware that PUMAs are not unique. It is of course fine that certain PUMAs are assigned to two areas in different states.)
Is there something wrong with my data set? Is there a special significance to certain PUMA codes (i.e. 100)?
There is nothing special about any particular PUMA code. The Census Bureau makes a moderate attempt to keep PUMA numbers consistent from year to year, but the fact that PUMA numbers overlap between states is a real annoyance to users. And no, that won’t change in the 2020 Census. I’d be stunned if it did.
What I usually do is
generate long unique_PUMA = st*1e5 + puma
so that I can see both the state and the PUMA number in that combined ID (st being the numeric state code). I think some of the Census products also show the state and PUMA codes joined together like this, maybe with a dot between them or something.
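For anyone not working in Stata, here is a minimal Python sketch of the same idea. The function names and the sample records are made up for illustration, and it assumes the state is stored as a numeric FIPS code and that PUMA codes never exceed five digits.

```python
def unique_puma(state_fips: int, puma: int) -> int:
    """Combine a numeric state code and a PUMA code into one integer ID."""
    # PUMA codes have at most five digits, so shifting the state code
    # five decimal places keeps the two parts from overlapping.
    return state_fips * 100_000 + puma


def split_unique_puma(combined: int) -> tuple[int, int]:
    """Recover (state code, PUMA code) from the combined ID."""
    return divmod(combined, 100_000)


# Hypothetical records: (state FIPS code, PUMA code)
records = [(8, 100), (40, 100), (47, 100), (19, 1400)]
ids = [unique_puma(st, p) for st, p in records]
print(ids)  # [800100, 4000100, 4700100, 1901400]
```

The three households with PUMA 100 end up with three different IDs, and both the state and the PUMA number remain readable in the result.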
Thanks for this reply. I’m not sure that is all I’m experiencing, though. My data look like this, covering about 1/3 of all households. (I’m sorry about the formatting; these should all be columns.) I cannot believe that this is right. It renders PUMAs almost entirely useless.
puma  state
100   colorado
100   oklahoma
100   tennesse
100   iowa
100   missouri
100   massachu
100   … all states
101   connecti
101   californ
101   californ
101   florida
101   californ
101   … all states
. . .
1400  iowa
1400  missouri
1400  indiana
1400  maryland
1400  south ca
1400  missis
1400  … all states
As noted on the codes tab of the PUMA variable, PUMA codes are state dependent. Therefore, they must be interpreted in combination with a state id variable.
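As a rough illustration of what that means in practice, the Python sketch below counts households first by PUMA code alone and then by the (state, PUMA) pair. The rows are invented sample data, not taken from any extract, and the variable names are only placeholders.

```python
from collections import Counter, defaultdict

# Hypothetical household rows: (serial, state, puma)
households = [
    (1, "colorado", 100),
    (2, "oklahoma", 100),
    (3, "colorado", 100),
    (4, "iowa", 1400),
    (5, "missouri", 1400),
]

# A bare PUMA code can show up in many states...
states_per_code = defaultdict(set)
for _, state, puma in households:
    states_per_code[puma].add(state)

# ...so the geographic unit is the (state, puma) pair, not the code alone.
households_per_area = Counter((state, puma) for _, state, puma in households)

print({code: len(states) for code, states in states_per_code.items()})
print(households_per_area[("colorado", 100)])
```

Seeing code 100 in many states is expected; each (state, PUMA) pair is a separate area.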