Flash crash of the rural population


Dear Folks–

I have attached (or tried to, hope it works) a spreadsheet containing data from the Case Count View of the IPUMS-CPS variable METAREA. If you scroll down the left-hand edge to about row 500 you will see a stacked area graph of METAREA counts divided into four categories: identified metro, unidentified metro, non-metro, and missing. I take non-metro to be rural; I am not sure what the N.I.U. label means here, since the number is too big to be an NIU by the standard definition. The values are percentages and add to 100 (and they really do; they are not forced to by some spreadsheet operation).

I was struck by the downward spike (in yellow) in the “Unidentified Metropolitan” band in 1977. Because this is a stacked graph and the top of that band does not change much, the spike represents a near-doubling of that share. What it actually means is that the count for rural people (or families?), which is between 29 and 32 percent for the five years preceding and eight years following 1977, is zero in 1977, and all of that value seems to have been reported in unidentified metropolitan instead.

Since it is clear that the rural population of the U.S. didn’t disappear for a year, this would seem to be an outlier in the strictest sense, and I wonder if you know anything about it.
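For anyone who wants to check for this kind of thing without a spreadsheet, here is a minimal sketch of the idea in Python. The counts below are made up purely for illustration (they are not real IPUMS-CPS figures), and the category names and the `flag_outlier_years` helper are my own hypothetical choices. It converts each year's counts to percentage shares and flags years where a category's share sits far from its median across years.

```python
from statistics import median

# Hypothetical case counts per year; NOT real IPUMS-CPS data.
counts = {
    1975: {"identified_metro": 400, "unidentified_metro": 300, "non_metro": 300, "missing": 0},
    1976: {"identified_metro": 410, "unidentified_metro": 290, "non_metro": 300, "missing": 0},
    1977: {"identified_metro": 400, "unidentified_metro": 600, "non_metro": 0, "missing": 0},
    1978: {"identified_metro": 390, "unidentified_metro": 300, "non_metro": 310, "missing": 0},
    1979: {"identified_metro": 400, "unidentified_metro": 290, "non_metro": 310, "missing": 0},
}


def shares(year_counts):
    """Convert one year's raw counts into percentage shares of that year's total."""
    total = sum(year_counts.values())
    return {name: 100.0 * n / total for name, n in year_counts.items()}


def flag_outlier_years(counts, category, tolerance=10.0):
    """Return years whose share for `category` differs from the
    median share across all years by more than `tolerance` points."""
    pct = {year: shares(c)[category] for year, c in counts.items()}
    typical = median(pct.values())
    return [year for year in sorted(pct) if abs(pct[year] - typical) > tolerance]


print(flag_outlier_years(counts, "non_metro"))
print(flag_outlier_years(counts, "unidentified_metro"))
```

The 10-point tolerance is arbitrary; with real data you would want to pick it by looking at how much the shares normally move from year to year.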

OK, you don’t have a spreadsheet, and I am pissed about it. It is at odds with your mission as an open-government, open-data nonprofit to limit access and interaction with you to proprietary formats, when there are free, open-source equivalents with substantial market share and superior performance – in this case LibreOffice’s spreadsheet Calc, the OpenOffice spreadsheet (also called Calc), and GNOME’s Gnumeric. All three of these programs support the OASIS Open Document Format, which, like the DDI format IPUMS uses, is an open standard set by an international nonprofit consortium. The open-source programs may be a bit less versatile, but unlike Excel, Calc and Gnumeric consistently get the right answer, and so are more suitable for reproducible research and science more generally.

They are also enormously less likely to harbor macro viruses, so security concerns would suggest that you allow information exchange in the secure free open-source reproducible format and ban interaction through the insecure proprietary monopoly-priced expensive buggy inaccurate occult-coded format. Or at least allow the better format, for gosh sakes. As between Microsoft and LibreOffice’s network of volunteers, which is providing software “for Good, never for Evil”?

Anyway, you can see the anomaly on the METAREA page, Case Count View, Codes 9997 and 9998 for 1977.



First of all, apologies for the spreadsheet attachment issue. We are still getting used to our new forum platform and failed to realize that Open Document Format file extensions (.odt, .ods, etc.) were not on the default list of allowable attachments (clearly the Spreadsheet Illuminati are to blame for this as well). You should now be able to attach these files, and if you notice any further difficulties with attachments feel free to let us know in the #feedback category.

Now on to your question about METAREA. The IPUMS CPS team has looked into the issue you identified and has confirmed that it is in fact an error, which means you are entitled to your very own IPUMS Mug! Please email ipums@umn.edu with an address where we can ship your coveted prize!

In the meantime, the METRO variable should provide the information you are looking for.

I hope this helps. And thank you for helping us maintain these important data resources.