I am using the ACS one-year samples for 2014-2019 to estimate MSA-level citizen employment and wages. I have filtered my extract to METRO = 2, AGE 18-64, EMPSTAT = 1, and CLASSWKR = 2. In R with dplyr, I filter to CITIZEN values of 0, 1, or 2, group by PWMET13 and YEAR (the metropolitan area of work in a given year), and then estimate citizen employment as the sum of PERWT and average wages as the sum of INCWAGE times PERWT divided by the sum of PERWT. This yields very small employment estimates for some MSA-years. For example, I get estimated citizen employment of 88 for PWMET13 30780 in 2019, which seems unrealistic. Other MSAs look plausible: for PWMET13 31080 in 2016 my estimate is 12.5 million employed citizens, which seems consistent with Google search results showing an MSA population of 13.3 million. So some of my estimates seem accurate, while others seem to vastly understate MSA citizen employment.
Am I going about this process incorrectly? Should I perhaps be using CPS or multi-year ACS samples? Any help would be appreciated. If any more details are necessary, please let me know.
Do you want statistics for workers who live in each metro or those who work in each metro? PWMET13 identifies the metro area where workers work, and MET2013 identifies the metro area where people live.
METRO also pertains to place of residence, distinguishing not just metro status but also residence in central/principal cities within metro areas. METRO = 2 limits your extract to those who live in a PUMA that lies entirely within the central/principal cities of a single metro area. That is only a subset of the individuals who live in the metro area as a whole, and it says nothing about where they work. See the METRO code list on the variable's IPUMS USA page.
If you want a summary of employment and wages for individuals who work in a metro area, you should filter on PWMET13 != 00000 rather than on METRO. That keeps all workers who work in an identifiable metro area. Note that this is not all workers in metro areas, only those working in place-of-work PUMAs that are associated with identifiable metro areas.
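As a rough illustration of the filter described above, here is a minimal dplyr sketch. The data frame `acs` and every value in it are made up for the example, and it assumes the extract has already been restricted to AGE 18-64, EMPSTAT = 1, and CLASSWKR = 2, and that PWMET13 is stored as a number (so 00000 reads as 0). It is a sketch under those assumptions, not a definitive implementation.

```r
library(dplyr)

# Toy stand-in for an IPUMS USA extract. All values are invented for illustration.
acs <- data.frame(
  YEAR    = c(2019, 2019, 2019, 2019, 2019, 2019),
  PWMET13 = c(30780, 30780, 30780, 31080, 31080, 0),
  METRO   = c(2, 3, 4, 2, 3, 1),
  CITIZEN = c(0, 0, 2, 1, 3, 0),
  INCWAGE = c(40000, 50000, 60000, 30000, 45000, 20000),
  PERWT   = c(100, 200, 100, 150, 120, 80)
)

msa_year <- acs %>%
  # Keep citizens (CITIZEN 0, 1, 2) who work in an identifiable metro area.
  # There is deliberately no METRO filter: METRO describes residence, not workplace.
  filter(CITIZEN %in% c(0, 1, 2), PWMET13 != 0) %>%
  group_by(PWMET13, YEAR) %>%
  summarise(
    citizen_emp = sum(PERWT),                      # weighted count of workers
    avg_wage    = weighted.mean(INCWAGE, PERWT),   # sum(INCWAGE * PERWT) / sum(PERWT)
    .groups     = "drop"
  )

# For comparison: adding METRO == 2 keeps only central/principal-city residents,
# which shrinks the workplace totals.
msa_year_metro2 <- acs %>%
  filter(CITIZEN %in% c(0, 1, 2), PWMET13 != 0, METRO == 2) %>%
  group_by(PWMET13, YEAR) %>%
  summarise(citizen_emp = sum(PERWT), .groups = "drop")
```

In this toy data the PWMET13 30780 total falls from 400 to 100 once METRO == 2 is applied, which is the same kind of undercount as in the original question, only on a much smaller scale.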
I recommend reading the PWMET13 description and/or METRO and MET2013 descriptions to understand better the meanings and limitations of these variables.
Thank you for your reply. I tried this again after removing the METRO == 2 filter and got much more convincing totals. I am interested in places of work, so I focused on the PWMET13 variable.
I do have a follow-up question: if I use these estimates in subsequent regressions, is there something I need to do to correct the standard errors produced from these estimates?
How to adjust standard errors depends on your specific analysis. In general, we recommend using the replicate weights (variable REPWTP) to calculate standard errors of estimates from the ACS data. However, if you are calculating these estimated MSA totals and then using them as inputs to a regression, how to proceed is less clear-cut. I recommend consulting an econometrics text for techniques to deal with measurement error in independent variables, along with the attached 2013 working paper by Solon, Haider, and Wooldridge.
2013_Solon, Haider, Wooldridge_What Are We Weighting For.pdf (174.8 KB)
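For the first step mentioned above (standard errors of the MSA-level estimates themselves), a minimal sketch of the replicate-weight calculation might look like the following. It assumes the extract includes REPWTP, which supplies the 80 person-level replicate weights REPWTP1 through REPWTP80, and it uses the ACS replicate-weight formula SE = sqrt(4/80 * sum over r of (estimate_r - estimate)^2). The toy data and the randomly generated replicate weights are placeholders invented for the example. This does not address how to carry that uncertainty into a later regression.

```r
library(dplyr)

set.seed(1)
n <- 200

# Toy person-level data. All values are invented for illustration.
acs <- data.frame(
  PWMET13 = sample(c(30780, 31080), n, replace = TRUE),
  YEAR    = 2019,
  PERWT   = sample(50:150, n, replace = TRUE)
)

# Placeholder replicate weights. A real extract supplies REPWTP1..REPWTP80;
# the random perturbation of PERWT here only stands in for them.
rep_names <- paste0("REPWTP", 1:80)
for (nm in rep_names) acs[[nm]] <- acs$PERWT * runif(n, 0.5, 1.5)

# Sum the full-sample weight and each replicate weight within MSA-year.
est <- acs %>%
  group_by(PWMET13, YEAR) %>%
  summarise(across(all_of(c("PERWT", rep_names)), sum), .groups = "drop")

# Squared deviations of the 80 replicate totals from the full-sample total,
# scaled by 4/80, give the variance of the estimated total.
rep_mat <- as.matrix(est[rep_names])
est$citizen_emp <- est$PERWT
est$se_emp <- sqrt(4 / 80 * rowSums((rep_mat - est$citizen_emp)^2))

result <- est[c("PWMET13", "YEAR", "citizen_emp", "se_emp")]
```

The same pattern should carry over to average wages: recompute the weighted mean of INCWAGE once with PERWT and once with each replicate weight, then apply the same formula to the 80 replicate means.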