(1) For a person in a household with a given household identifier, CPSID, how many digits of their person identifier, CPSIDP, are supposed to match the household’s CPSID, and which ones?
(2) If a file contains a person’s CPSID, can the digits of CPSIDP that differ from it be determined from other variables?
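To make the question concrete, here is a minimal sketch of the kind of empirical check I could run on my own extract: it reports which digit positions ever differ between CPSID and CPSIDP. The function name, the 14-digit width, and the sample values are my assumptions for illustration only; the values are made up, not real IPUMS identifiers, and the result says nothing about how the identifiers are actually constructed.

```python
def differing_digit_positions(pairs, width=14):
    """Return the 0-based digit positions at which CPSID and CPSIDP
    differ in at least one (cpsid, cpsidp) pair.

    `width` is an assumed identifier length; both values are
    zero-padded to it so positions line up.
    """
    positions = set()
    for cpsid, cpsidp in pairs:
        hh = str(cpsid).zfill(width)
        pp = str(cpsidp).zfill(width)
        for i, (a, b) in enumerate(zip(hh, pp)):
            if a != b:
                positions.add(i)
    return sorted(positions)


# Made-up identifiers, for illustration only.
sample = [
    (20200100000100, 20200100000101),
    (20200100000100, 20200100000102),
    (20200100000200, 20200100000201),
]
print(differing_digit_positions(sample))
```

A check like this only shows what happens in the rows I have; it cannot tell me what is guaranteed by design, which is why I am asking.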
The reason I am asking is that I’ve divided my (almost) complete IPUMS-CPS data into 22 files, and I think I have established that roughly a third of the aggregate size comes from repeated variables that are added automatically to every extract. I need all those variables, but I don’t need 22 copies of each of them. So I am trying to identify the minimal set of variables needed to link individuals and households within the ASEC (only). I would keep that set in every file, for safety’s sake, and keep each of the remaining variables only in the file for its own variable group. I think that minimal set consists of CPSID plus any digits of CPSIDP that are, or can be, different from those of the household’s CPSID. Does that seem right?
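The layout I have in mind can be sketched as follows: every variable-group file carries the same small set of key columns, and the files are re-linked on those keys when needed. This is only an illustration under my own assumptions. The helper names, the choice of CPSID and PERNUM as keys, and the sample rows are hypothetical; which key columns are actually sufficient is exactly what I am trying to find out.

```python
def split_by_group(rows, key_cols, groups):
    """Split rows (a list of dicts) into one table per variable group.

    Each output table keeps the key columns plus that group's own
    columns, so the key columns are the only ones duplicated.
    """
    out = {}
    for name, cols in groups.items():
        keep = list(key_cols) + [c for c in cols if c not in key_cols]
        out[name] = [{c: row[c] for c in keep} for row in rows]
    return out


def relink(parts, key_cols):
    """Rejoin the split tables on the key columns."""
    merged = {}
    for part in parts:
        for row in part:
            key = tuple(row[c] for c in key_cols)
            merged.setdefault(key, {}).update(row)
    return list(merged.values())


# Made-up rows; the key choice is an assumption, not IPUMS guidance.
keys = ("CPSID", "PERNUM")
rows = [
    {"CPSID": 1, "PERNUM": 1, "AGE": 40, "SEX": 1, "INCTOT": 50000},
    {"CPSID": 1, "PERNUM": 2, "AGE": 38, "SEX": 2, "INCTOT": 42000},
]
groups = {"demographics": ["AGE", "SEX"], "income": ["INCTOT"]}

parts = split_by_group(rows, keys, groups)
restored = relink(parts.values(), keys)
```

If the keys uniquely identify a person, `restored` contains the same records as `rows`; if they do not, distinct people would be silently merged, which is the failure I want to rule out.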
Are the answers to questions (1) and (2) above unchanged if a person moves into an ongoing household after the first month that the household is in the survey? How is PERNUM handled for such individuals? Will their PERNUM always be higher than every PERNUM value assigned to people who were present in the household in its first month in the survey?