Basic Question about Interpreting Merged Data


When analyzing more than one month of CPS data, how do I determine the total population count? I understand how to apply weights to find the population count for one month of data, but I’m having difficulty with the concept of merged data. For example, I understand that I could look at the number of employed workers in January 2014 by applying earnwt to empstat. When I look at the full 2014 data and do the same thing, I see that the count is roughly 12 times as large. To arrive at an average, can I simply divide this yearly count by 12? I suspect this is too simplistic, so I would appreciate your help!

Thanks very much.


Yes, simply dividing by twelve gives an “average” monthly figure for 2014. Each monthly sample is weighted to the total population of the U.S. for that month (which presumably does not change greatly from month to month), so applying the weight to all respondents across all twelve months is the same as estimating each month's figure and then adding them together. Dividing that sum by twelve therefore yields the average over the twelve-month period.
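To make the equivalence concrete, here is a minimal sketch in Python. The records, weights, and function names are invented for illustration and are not real CPS values or variables; the example also uses three months rather than twelve to keep it short. It shows that dividing the pooled weighted count by the number of months gives the same result as averaging the separate monthly estimates.

```python
from collections import defaultdict

# Hypothetical toy records: (month, employed flag, person weight).
# These values are made up; real CPS weights come from the extract.
records = [
    (1, True, 1500.0), (1, False, 1200.0), (1, True, 1800.0),
    (2, True, 1600.0), (2, True, 1700.0), (2, False, 1100.0),
    (3, False, 1300.0), (3, True, 2000.0), (3, True, 1400.0),
]


def monthly_employed_totals(rows):
    """Weighted count of employed persons, estimated separately per month."""
    totals = defaultdict(float)
    for month, employed, weight in rows:
        # Touch every month so a month with no employed rows still counts.
        totals[month] += weight if employed else 0.0
    return dict(totals)


def pooled_employed_total(rows):
    """Weighted count of employed persons over all months pooled together."""
    return sum(weight for _, employed, weight in rows if employed)


monthly = monthly_employed_totals(records)
n_months = len(monthly)

# Route 1: pool everything, then divide by the number of months.
avg_from_pooled = pooled_employed_total(records) / n_months

# Route 2: estimate each month, then average the monthly estimates.
avg_from_months = sum(monthly.values()) / n_months

print(monthly)
print(avg_from_pooled, avg_from_months)
```

Both routes give the same average because the pooled sum is just the monthly sums added together. The per-month route has the side benefit of showing how much the estimate moves from month to month.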

When combining multiple monthly samples from the CPS, it is important to consider the panel-like nature of the survey. Because CPS respondents are part of the sample for a total of eight months (see MISH description), a pooled file will inevitably include repeated measures of the same people. This could introduce bias into estimates that treat a pooled 12-month sample as representing the total population over that time period. However, averaging the monthly estimates as you propose still treats each sample as representative within its own month.
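As a rough way to see how much overlap a pooled file contains, one could count how often each person appears. The sketch below is in Python and assumes a hypothetical `person_id` that links the same respondent across months; the rows are invented for illustration, and in a real extract a person-level linking identifier would have to play that role.

```python
from collections import Counter

# Hypothetical pooled rows: (person_id, month). Values are invented.
pooled_rows = [
    ("A", 1), ("A", 2), ("A", 3),
    ("B", 1), ("B", 2),
    ("C", 2), ("C", 3),
    ("D", 3),
]


def appearances_per_person(rows):
    """Count how many monthly records each person contributes."""
    return Counter(person_id for person_id, _ in rows)


counts = appearances_per_person(pooled_rows)
n_records = len(pooled_rows)   # rows in the pooled file
n_people = len(counts)         # distinct respondents
n_repeated = sum(1 for c in counts.values() if c > 1)

print(n_records, n_people, n_repeated)
```

If the number of records is much larger than the number of distinct people, the pooled file is far from a set of independent observations, which is the concern described above.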