Hi,
I am new to the CPS data. I’m putting together annual aggregate income data for the City of Chicago from 1967 to the present. Can someone review my steps (below) in case I am missing something important? Also, my 2019 estimate is $386,109,662,828. The ACS 5-year is 96,775,435,900. The 5-year is ~104 billion. Is this due to an error on my part, or is it normal for the CPS to differ that much?
First, I will select out Chicago using the METFIPS and INDIVIDCC variables – checking with the technical documentation in case anything changes.
Second, since INDIVIDCC is a household variable, I will only be able to get statistics for household income. I will use the ASECWTH weight and PERNUM = 1. I am using analytic weights in Stata.
I meant to say: $96.8 bn for the 5-year and roughly $104 bn for the 1-year ACS.
I was not able to replicate your issue using the CPS 2019 ASEC data, but instead got a number very similar to your estimates using the ACS. The code I ran was:
total hhincome if pernum==1 & metfips==16980 & individcc==1 & hhincome<9999999 [pw=asecwth]
One possibility of what might be skewing your results is that you did not drop the observations for HHINCOME that are not-in-universe (HHINCOME = 9999999). I hope this helps.
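To illustrate why leaving the not-in-universe code in can inflate a total, here is a minimal sketch on invented numbers (three made-up households, not CPS records). The variable names mirror the command above; the values and weights are assumptions for illustration only.

```stata
* Toy data, invented for illustration only (not CPS records)
clear
input long hhincome double asecwth
50000    1000
80000    1500
9999999  1200
end

* Including the NIU code: the 9999999 placeholder is summed as if it were income
total hhincome [pw=asecwth]

* Excluding the NIU code, as in the command above
total hhincome if hhincome < 9999999 [pw=asecwth]
```

In this toy case, the single NIU row accounts for almost all of the first total, which is the kind of inflation that could explain an estimate several times larger than the ACS figure.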
Thank you! I re-ran the numbers for aggregate personal income. My estimate is about $24 million above the 5-year ACS aggregate income, but my confidence interval falls within the ACS margin of error!
I appreciate the help.
One follow-up question. I followed the same procedure for 2010, but the CPS estimate is about $20 billion below the ACS 5-year. The ACS 1-year is about 69 billion. Any thoughts on this? I have put the ACS results, my results, and my code below, in case the problem is on my end. Thanks!
ACS 5-year aggregate personal income: 73,392,412,400 moe: 544,365,331
CPS: 53,805,259,431, SE: 2,531,396,853, CI: 4.88e+10 5.88e+10
Here is my Stata code:
total inctot if metfips==16980 & individcc==1 & inctot<999999999 & year == 2010 [pw=asecwt]
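A rough way to check these figures against each other, sketched in Stata with the numbers quoted above. It relies on two assumptions: that the ACS margin of error is published at the 90 percent confidence level (so SE = MOE / 1.645), and that the two estimates can be treated as independent. The second is a simplification, since the ACS 5-year figure pools several years.

```stata
* Figures copied from the results quoted above
scalar cps_est = 53805259431
scalar cps_se  = 2531396853
scalar acs_est = 73392412400
scalar acs_moe = 544365331

* Approximate 95% interval for the CPS total (normal approximation)
scalar cps_lb = cps_est - invnormal(0.975)*cps_se
scalar cps_ub = cps_est + invnormal(0.975)*cps_se
display %18.0fc cps_lb "  " %18.0fc cps_ub

* Assumption: ACS margin of error is at the 90% level, so SE = MOE / 1.645
scalar acs_se = acs_moe/1.645

* Rough z statistic for the gap, treating the two estimates as independent
scalar z = (acs_est - cps_est)/sqrt(cps_se^2 + acs_se^2)
display %6.2f z
```

The interval reproduces the 4.88e+10 to 5.88e+10 range reported above. A z statistic well above 2 would suggest that the gap is larger than sampling variation alone would produce, under the stated assumptions.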
Taking a look at the differences in INCTOT between the ASEC and ACS over the past decade gives us a better view of the trends.
For the ASEC I ran: total inctot if metfips==16980 & individcc==1 & inctot !=999999999 & year == 2005 [pw=asecwt].
For the ACS I ran: total inctot if city == 1190 & inctot !=9999999 & year == 2009 [pw=perwt]
The Census Bureau introduced redesigned income questions in the 2014 ASEC because the earlier income questions had been deemed inaccurate. However, only 3/8 of the 2014 sample received the new income questions; the other 5/8 received the old questions. For this reason, 2014 estimates should be generated by restricting the sample to HFLAG = 1.
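As a minimal sketch of that restriction, the toy data below (invented rows, not CPS records) shows the 2014 total computed only from the HFLAG = 1 records, with the INCTOT not-in-universe code excluded as in the commands above. The values and weights are assumptions for illustration; on a real extract you would add the same metfips and individcc conditions.

```stata
* Toy data, invented for illustration only (not CPS records)
clear
input int year byte hflag long inctot double asecwt
2014 0  45000     1600
2014 0  52000     1500
2014 1  55000     2600
2014 1  65000     2400
2014 1  999999999 2500
end

* Keep only the HFLAG == 1 records for 2014 and drop the NIU code
total inctot if year==2014 & hflag==1 & inctot!=999999999 [pw=asecwt]
```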
The 2016-2019 differences between the ACS and ASEC look like sampling error. The 2009-2013 differences also look like sampling error, but around a different mean because of the difference in income questions. That still leaves the low 2014 and 2015 estimates unexplained. I have followed up with our CPS team on this and will let you know when I hear back from them. For now, my recommendation is to use estimates from the ACS. If you are going to use estimates from the CPS, I would analyze the years before 2014 separately from the years after 2015.
Thank you so much for putting this together! It’s very helpful to see. It is making me think that I should stick with the decennial census (1970, 1980, 1990, and 2000) and the 1- and 5-year ACS for other years. I appreciate you putting so much work into this; it saved me a huge headache down the line.
Certainly! I’m glad it was helpful. This does highlight how sensitive the data we use is to the specific procedures surveys employ. I will try to see if we can get better documentation on these changes.
Feel free to reach out with any further questions you have!