I’m using the EARNWT variable to weight data from the 2015 monthly CPS, restricted to observations where MISH = 4 or 8. Even so, when I use EARNWT as a weight in Stata, my regression reports an incorrect number of observations (- 16 trillion). Should I use only one month of data rather than the entire year’s worth? That is, should my regression include a filter that restricts the data to January, for example?
You are correct. Each month’s EARNWT values are designed to weight that month’s sample up to the population total, so each month of data yields its own total population estimate. Pooling 12 months of data therefore produces weighted estimates 12 times the size of the total population. To create an annual figure from these monthly data, you could compute monthly estimates (as you proposed for January) and then average them. Alternatively, you could divide each person’s EARNWT value by 12 before calculating the weighted estimates. The two methods should give very similar results.
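To make the arithmetic concrete, here is a minimal sketch in Python rather than Stata, using made-up numbers. The toy records, the three-month span, and the function names are all assumptions for illustration and are not taken from an actual CPS extract. It compares the naive pooled total with the two approaches described above.

```python
from collections import defaultdict

# Hypothetical toy records: (month, EARNWT). These values are invented;
# a real extract would have one row per person per month.
records = [
    (1, 1500.0), (1, 2500.0),
    (2, 1800.0), (2, 2200.0),
    (3, 2100.0), (3, 1900.0),
]

def pooled_total(rows):
    """Sum the weights across all pooled months (the inflated estimate)."""
    return sum(weight for _, weight in rows)

def average_of_monthly_totals(rows):
    """Approach 1: compute each month's weighted total, then average them."""
    totals = defaultdict(float)
    for month, weight in rows:
        totals[month] += weight
    return sum(totals.values()) / len(totals)

def total_with_rescaled_weights(rows, n_months):
    """Approach 2: divide every weight by the number of pooled months."""
    return sum(weight / n_months for _, weight in rows)

n_months = len({month for month, _ in records})  # 3 here; 12 for a full year
naive = pooled_total(records)
approach_1 = average_of_monthly_totals(records)
approach_2 = total_with_rescaled_weights(records, n_months)

print(f"pooled, unadjusted:        {naive:.1f}")
print(f"average of monthly totals: {approach_1:.1f}")
print(f"weights divided by months: {approach_2:.1f}")
```

In this toy case every month sums to the same total, so the unadjusted pooled figure is exactly the number of months times either adjusted figure. With real data the monthly totals differ slightly, and statistics other than totals (means, regression coefficients) can differ a little between the two approaches.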