MIGSTA1 and WHYMOVE consistency

Hello! A follow up to this question:

I’m using MIGSTA1 to look only at people who moved from a given state (state A), and then tallying the WHYMOVE reasons to find the most common one.

However, the sum of the people who gave each reason is much larger than the ACS 5-year estimate of how many people moved from state A to other states (980,000 compared to 330,000). Why is there such a large discrepancy? How can I gauge how representative these responses are of people who moved from state A?


Oh, I just realized that I was not using the MIGRATE1 variable to look only at out-of-state movers. Now the number is closer to 127,000, which, considering the total, seems too good a sample to be true?

Here are the variables I’m using. I’m also trying to just limit it to 2018 responses.

Also, is there a reason some reasons are not listed at all for some states? (e.g., Michigan, under the above parameters, doesn’t list “retirement” as a reason for moving.)

I think you may be confusing the estimates you produced with the sample size. When you create your table in SDA, you can choose to display estimated numbers (using weights), unweighted numbers, or both. The 127,000 you mentioned is a weighted estimate of the total number of movers from Illinois to another state. The actual number of survey respondents behind that estimate was only 61.

The missing categories of WHYMOVE are a similar situation. Only categories with at least one respondent are listed. Most categories of WHYMOVE with MIGSTA1=Michigan have only a small number of respondents, so if few retirees moved from Michigan, it could well be that none of them were sampled by CPS in a given month. If you expand the number of years considered, you’ll see more categories show up.
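To make the weighted versus unweighted distinction concrete, here is a minimal sketch in Python. The records, reason labels, and weights are invented for illustration and are not CPS data; the `tabulate` helper is hypothetical and is not part of SDA or IPUMS.

```python
from collections import Counter, defaultdict

# Hypothetical microdata: one (reason_for_move, person_weight) pair per respondent.
# All values are invented for illustration; they are not CPS figures.
records = [
    ("new job", 2100.0),
    ("new job", 1950.0),
    ("cheaper housing", 2300.0),
    ("family", 1800.0),
    ("new job", 2050.0),
]

def tabulate(records):
    """Return (unweighted counts, weighted estimates) keyed by reason."""
    unweighted = Counter()
    weighted = defaultdict(float)
    for reason, weight in records:
        unweighted[reason] += 1      # one actual survey respondent
        weighted[reason] += weight   # people that respondent represents
    return unweighted, dict(weighted)

unweighted, weighted = tabulate(records)
sample_size = sum(unweighted.values())    # number of respondents
estimated_total = sum(weighted.values())  # estimated population count

print("respondents:", sample_size)
print("estimated movers:", estimated_total)
print("reasons observed:", sorted(unweighted))
```

In this toy data, five respondents stand in for an estimated 10,200 movers, and a reason such as "retirement" never appears in the output because no sampled respondent gave it.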

Ah, yes, I was mistakenly using the estimated numbers as the total surveyed. 61 seems rather low to base too many conclusions on, even when using the confidence intervals for each percentage. However, I know CPS has cautioned against comparing responses between years because of wording changes – does the data used for the SDA account for those differences?
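As a rough illustration of how wide the uncertainty is with 61 respondents, the sketch below computes a Wilson score interval for a proportion. It assumes simple random sampling and ignores the CPS weights and complex sample design, so a design-based interval would differ and would likely be wider. The count of 20 respondents giving one reason is hypothetical, as is the `wilson_interval` helper.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion.

    Assumes simple random sampling; it does not account for
    survey weights or a complex sample design.
    """
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width

# Hypothetical: 20 of the 61 respondents gave the same reason for moving.
lo, hi = wilson_interval(20, 61)
print(f"share: {20 / 61:.3f}, interval: ({lo:.3f}, {hi:.3f})")
```

With these made-up counts the interval spans more than 20 percentage points, which is one way to see why a single year of responses for one state supports only tentative conclusions.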

You can check the variable’s Comparability tab to see whether the variable is comparable over time. Note that this tab mainly documents category changes and does not pick up every wording change. You can check precise wording changes for many variables using the Questionnaire Text tab. I don’t think you have much to worry about for your variables, so pooling years should be fine.