I am doing research that requires identifying at least the county each individual lives in, but I found that many counties have no observations in IPUMS-CPS. Here are my questions:
- I know that some counties are not in the CPS sample and that others are in the sample but not identifiable for confidentiality reasons. Are all counties listed on this page in the sample?
https://cps.ipums.org/cps/codes/county_20042005_codes.shtml
If a county code on this page has no observations, may I assume that residents of that county are assigned county code 0 for confidentiality reasons?
- Within a metropolitan area, I found that a county's share of the observations in its Metropolitan Statistical Area (MSA) is not close to that county's share of the MSA population. I downloaded basic monthly data from May 2004 to December 2017. For example:
The metfips 37980 (Philadelphia-Camden-Wilmington, PA-NJ-DE-MD Metropolitan Statistical Area) contains county codes 34005, 34007, 34015, 42017, 42029, 42045, 42091, 42101, 10003, 24015, and 34033 by the definition of this MSA. From the ACS 1-year population estimates, I know that the population of county 10003 (New Castle County) is approximately 9% of the population of this MSA. But observations with county code 10003 make up 38.61% of the observations with metfips code 37980.
I am confused by this inconsistency.
Does this mean that not every household in the MSA has an equal probability of being selected?
Is it wrong to assume that the number of observations in each county within an MSA is proportional to the county's population?
- Given the many examples like the one in question 2, is it possible that some individuals in a county (for example, 01073) are assigned county code 01073 while others in the same county are assigned county code 0?
- In some MSAs, the county code is zero for every observation with that metfips code. For example, metfips 25540 has no observations that can be identified at the county level.
For these MSAs, are the observations proportional to the county populations? Is there any way I can identify residence at the county or even city level for these observations?
- I noticed that some counties can be identified during some time periods but not after a specific date. How does this happen? Could a population decrease be large enough to make a county no longer identifiable?
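To make the tabulations behind questions 2 and 5 concrete, here is a minimal sketch of the kind of calculation I mean. It uses made-up records rather than real CPS data, and the dictionary keys `YEAR`, `MONTH`, `METFIPS`, and `COUNTY` are assumptions about how the extract is loaded. The helper names `county_shares` and `county_periods` are hypothetical. The shares are unweighted counts, and records with county code 0 are kept in the denominator.

```python
from collections import Counter


def county_shares(records, metfips):
    """Unweighted share of records per county code within one metro area.

    Records with county code 0 (not identified) stay in the denominator.
    """
    counts = Counter(r["COUNTY"] for r in records if r["METFIPS"] == metfips)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {county: n / total for county, n in counts.items()}


def county_periods(records):
    """First and last (YEAR, MONTH) in which each nonzero county code appears."""
    periods = {}
    for r in records:
        if r["COUNTY"] == 0:
            continue  # skip records whose county is not identified
        when = (r["YEAR"], r["MONTH"])
        first, last = periods.get(r["COUNTY"], (when, when))
        periods[r["COUNTY"]] = (min(first, when), max(last, when))
    return periods


# Toy records for illustration only; these are not real CPS observations.
toy = [
    {"YEAR": 2004, "MONTH": 5, "METFIPS": 37980, "COUNTY": 10003},
    {"YEAR": 2004, "MONTH": 5, "METFIPS": 37980, "COUNTY": 0},
    {"YEAR": 2010, "MONTH": 1, "METFIPS": 37980, "COUNTY": 10003},
    {"YEAR": 2010, "MONTH": 1, "METFIPS": 37980, "COUNTY": 0},
    {"YEAR": 2017, "MONTH": 12, "METFIPS": 37980, "COUNTY": 42101},
    {"YEAR": 2017, "MONTH": 12, "METFIPS": 25540, "COUNTY": 0},
]

shares = county_shares(toy, 37980)
periods = county_periods(toy)
print(shares)
print(periods)
```

Note that whether code 0 records are included in the denominator changes the shares considerably, so a comparison against ACS population shares depends on that choice.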
Thank you so much!