I’m using the monthly data and I’m trying to aggregate population totals by period. I’m using code that looks something like this:
gen pop_start = 1
collapse (sum) pop_start [iw=wtfinl], by(countyreal year month)
The problem is, I’m getting results that bounce all over the place. In any given county, the population from month to month will jump wildly up or down.
What am I doing wrong here? Might it be that I can only aggregate to the national level, and not the county level?
This is a graph of the average population by county over time:
I’m a new user so I couldn’t add more than one picture, but I wanted to include these, too.
This is a graph of total weight from all counties over time. This looks right and follows population totals for all these years.
And finally, the average weight by county. It doesn’t have the nice, consistent upward trajectory that there is at the national level.
Two limitations of CPS data at the county level likely explain what you are seeing. First, as noted on the description tab of the COUNTY variable, not all counties are identifiable in all samples, in order to preserve respondent confidentiality. Specifically, only about 45% of all households reside in an identifiable county in any given sample. Second, the CPS sampling methodology is designed primarily to produce reliable national estimates. Even estimates from the larger ASEC samples are known to be “not as reliable” at the state level; see page 2-8 of this document for more discussion. The issue will therefore be even more pronounced for county-level estimates from the basic monthly samples.
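If it helps to see both limitations in your own extract, here is a rough Stata sketch. It assumes the county identifier is named county and that a value of 0 means the county is not identified; please check the codes on the COUNTY variable page and substitute your own variable (for example countyreal) if it differs. The names identified and n_obs are made up for illustration.

```stata
* Assumption: county == 0 marks households whose county is not identified
gen byte identified = (county != 0)
gen byte n_obs = 1

* Weighted share of the sample living in an identifiable county, by month
preserve
collapse (mean) identified [aw=wtfinl], by(year month)
list year month identified in 1/12
restore

* Unweighted number of respondents behind each county-month total
preserve
keep if county != 0
collapse (sum) n_obs, by(county year month)
summarize n_obs, detail
restore
```

If the first listing shows a share well below 1, or one that shifts between months, and the second shows only a small number of respondents in many county-month cells, then month-to-month jumps in the weighted county totals are what you would expect.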
That is very helpful, Jeff. Sounds like the solution is “by design there is no solution”.