Maggie's Farm

We are a commune of inquiring, skeptical, politically centrist, capitalist, anglophile, traditionalist New England Yankee humans, humanoids, and animals with many interests beyond and above politics. Each of us has had a high-school education (or GED), but all had ADD so didn't pay attention very well, especially the dogs. Each one of us does "try my best to be just like I am," and none of us enjoys working for others, including for Maggie, from whom we receive neither a nickel nor a dime. Freedom from nags, cranks, government, do-gooders, control-freaks and idiots is all that we ask for.
Sunday, January 22, 2012

A repost: Fallacies of the Week: A few fun Data Fallacies

We have reported so many scientific frauds in the past couple of weeks that I thought I would highlight some commonly-used "data-management" tricks designed to dishonestly influence people.

1. "Clustering." We have all heard about cancer clusters: why does my town have triple the breast cancer of towns two miles away? There must be someone I can sue about this. Such claims have an emotional appeal, but they are nonsense. Random distribution is not even - it is uneven. Just try flipping a quarter, and you will get little runs of tails. Clustering is a natural effect of randomness, but trial lawyers are always busy trying to track clusters down: they can get rich before anyone figures out the game.
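The point about clusters falling out of pure randomness is easy to demonstrate. Here is a minimal sketch in Python (the 200-flip count, the 20 "towns," and the 1,000 "cases" are arbitrary numbers chosen for illustration): every town has exactly the same true risk, yet the counts still spread out, and the coin still produces runs of tails.

```python
import random

random.seed(1)  # fixed seed so the demonstration is reproducible

# Flip a fair coin 200 times and find the longest run of tails.
flips = [random.choice("HT") for _ in range(200)]
longest = run = 0
for f in flips:
    run = run + 1 if f == "T" else 0
    longest = max(longest, run)
print("longest run of tails in 200 fair flips:", longest)

# Scatter 1,000 "cases" uniformly across 20 identical towns.
# No town has any elevated true risk, yet the counts vary widely.
counts = [0] * 20
for _ in range(1000):
    counts[random.randrange(20)] += 1
print("cases per town, sorted:", sorted(counts))
print("the 'worst' town has %.2fx the cases of the 'best'"
      % (max(counts) / min(counts)))
```

Run it a few times with different seeds: some town always looks like a "cluster," which is exactly why a cluster alone proves nothing.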
2. "Cherry-picking." Cherry-picking is a frankly dishonest form of data presentation, often used by newspapers to create alarmist stories about the economy, the environment, food safety, etc. It fools people who lack a decent science education. What it entails is combing through, say, 60 pieces of data, then using the three points that support your argument and ignoring the rest. Presenting random changes as meaningful facts is a lie. Environmentalists use this all of the time, as do other agenda-driven fact-handlers. A casual use of this fallacy is characteristic of the typical New York Times headline: Despite Good Economic Statistics, Some Are Left Behind - and then they scour NYC to find some single black mom in the Bronx who cannot support her kids - and she becomes the "story."

3. "Anecdotal evidence." The above example could also be termed "anecdotal evidence." If you look around, you can always find an exception, a story, an example - of ANYTHING. But anecdotes are compelling, and Reagan used them to the best effect. And how about those swimming Polar Bears! (I always thought they liked to swim.)

4. "Omitted evidence." You tell me how common this is! A first cousin of cherry-picking, omitted evidence is also a lie. All you do is ignore the evidence and data that disagree with your bias or your position. Simple.

5. "Confirmation bias." People tend to remember evidence which supports their opinion, belief, or bias, and to dismiss or forget evidence which does not. It's a human frailty. Humans have to struggle to be rational.

6. "Biased data." "A poll at a local pre-school playground in Boston at 2 pm today indicated that 87% of likely voters will vote for Obama." Picking your data sources, like picking the questions you ask, can determine your results with great accuracy. As pollsters always say, "Tell me the answer you want, and I will design the question."

7. "Data mining." Data-mining is used by unscrupulous academics who need to publish.
Because it is a retroactive search for non-hypothesized correlations, it does not meet the criteria of the scientific method. Let's say you have 10,000 data points from a study which found no correlation supporting your hypothesis. Negative studies are rarely published, but you spent a lot of time collecting the data - so you ask your computer whether it can find any other positive correlations in it. Then you publish those, as if that were what you had set out to study in the first place.

Image: two good varieties of cherries for picking; Stella on the left, Lapins on the right, from Miller Nurseries

Comments
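The data-mining trap in item 7 is the classic multiple-comparisons problem, and a short simulation makes it concrete. This is a minimal sketch with made-up numbers: 50 subjects, 100 predictor variables that are pure noise, and 0.28 used as roughly the two-tailed 5% significance cutoff for a correlation at n = 50. Even though nothing is truly correlated with anything, a handful of "significant" findings pop out by chance - each one a publishable-looking false positive.

```python
import random
import statistics

random.seed(0)  # fixed seed so the demonstration is reproducible

def pearson(x, y):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

n = 50  # subjects
outcome = [random.gauss(0, 1) for _ in range(n)]

# 100 predictors of pure noise: by construction, NONE relates to the outcome.
hits = 0
for _ in range(100):
    predictor = [random.gauss(0, 1) for _ in range(n)]
    if abs(pearson(predictor, outcome)) > 0.28:  # ~5% cutoff for n = 50
        hits += 1

print(hits, "of 100 pure-noise predictors look 'statistically significant'")
```

With a 5% false-positive rate per comparison, testing 100 noise variables yields about five spurious "discoveries" on average - which is exactly what the retroactive fishing expedition then publishes.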
Thanks. Nothing new - but it's always nice to be able to NAME the fallacies we read daily.
Other than #7, all of these can be learned by being a baseball statistics guy. Bill James was not the best of the sabermetricians, but he was a good populariser who gave examples that fans with any familiarity with numbers could understand. Too often in discussions, "he's a great clutch hitter" was seldom an intentional misrepresentation or cherry-picking, but was often anecdotal evidence or confirmation bias.
Relatedly, I would add "inadequate sample size" to your list. Of course, not everything so labeled is necessarily a fallacy. On some blog about Thugo Chávez's alleged GREAT accomplishments in improving health care in Venezuela, I brought up the Chávez regime's record on infant mortality.

That record is about average compared to the rest of Latin America, which is a pretty good indication that the claims of GREAT health care progress under the Thugo Chávez regime are nothing but hot air. The responses of the PSFs who were supporting Thugo were that I was CHERRY-PICKING data. Au contraire: infant mortality is usually used as the gold standard for measuring health care progress in the Third World. [For decade upon decade, Fidel's Fans repeatedly used infant mortality statistics in trumpeting how good Fidel was for Cuba. Their doing so is a good example of the fallacy of omitted evidence, on at least two counts. 1) Infant mortality in Cuba in the 1950s was among the best in Latin America, and better than in some countries in Europe. 2) The Pinochet regime made better progress than Castro's Cuba in reducing infant mortality, yet Fidel's Fans never point this out, nor are they probably even aware of it. I once brought this up to a crowd of Sandalistas in 1990, after Daniel Ortega lost the election to Violeta Chamorro. The response was that Castro didn't "get the help" that Pinochet got. As if the USSR had not been Castro's sugar daddy for 30 years!] [PSF = Pendejos Sin Fronteras]

Good points. Remember also that different countries attach different meanings to "infant" and "mortality." For example, in some countries a preterm birth does not qualify as a live birth until the expected, normal term date has passed. If the infant dies before this normal 38-41 week date, the birth is not counted as an "infant mortality." Liberal professors use tricks like this to convince students that the U.S. infant mortality rate is "26th worst in the world." And, "See how wonderful the medical system in Cuba is? Look how much better they are than the U.S." The World Health Organization keeps some of the stats, and is a good source.
They don't always explain that some countries, like the U.S., try (and succeed in far greater numbers than most other nations) to save preterm babies, and do count them.
The appearance of cancer clusters has long been a favored technique of activists seeking to discredit commercial nuclear power.
They will typically claim that clusters of cancers appear downwind of a nuke once the plant starts operating. Clusters that appear elsewhere, not geographically related to a nuke, are ignored. The earliest case involved the Shippingport reactor in Pennsylvania back in the late 1950s. They are still doing it, and the press is still reporting it. As Mark Twain said, a lie can travel around the world before the truth gets its boots on.
It was clear to me that the point Michael Crichton made in his speech we linked was that linear thinking does not accurately describe the world, or predict events. Indeed it does not. Linear thinking is the domain of dangerous oversimplification and sito
Tracked: Feb 06, 06:11
Back in the days when I was involved with research, I saw some iffy research once as a junior associate. The pressure to come up with some result was powerful, and people kinda sorta convinced themselves that it meant something. Here's how it worked: You s
Tracked: Oct 02, 12:04