We have reported on so many scientific frauds in the past couple of weeks that I thought I would highlight some commonly-used "data-management" tricks designed to dishonestly influence people.
1. "Clustering." We have all heard about cancer clusters - Why does my town have triple the breast cancer of towns two miles away? There must be someone I can sue about this. Such claims have an emotional appeal, but they are nonsense. Random distribution is not even - it is uneven. Just try flipping a quarter, and you will get little runs of tails. Clustering is a natural effect of randomness, but trial lawyers are always busy trying to track them down: they can get rich before anyone figures out the game.
2. "Cherry-picking." Cherry-picking is a frankly dishonest form of data presentation, often used by newspapers to create alarmist stories about the economy, the environment, food safety, etc. It fools people without some decent science education. What it entails is combing through, say, 60 pieces of data, and then using the three points that support your argument, and ignoring the rest. Presenting random changes as meaningful facts is a lie. Environmentalists use this all of the time, as do other agenda-driven fact-handlers. A casual use of this fallacy is characteristic of The New York Times typical headline: Despite Good Economic Statistics, Some Are Left Behind - and then they scour NYC to find some single black mom in the Bronx who cannot support her kids - and she becomes the "story".
3. "Anectdotal evidence." The above example could also be termed "anectdotal evidence." If you look around, you can always find an exception, a story, and example - of ANYTHING. But anectdotes are compelling, and Reagan used them to the best effect. And how about those swimming Polar Bears! (I always thought they liked to swim.)
4. "Omitted evidence". You tell me how common this is! A first cousin of Cherry-picking, Omitted Evidence is also a lie. All you do is ignore the evidence and data that disagrees with your bias or your position. Simple.
5. "Confirmation bias". People tend to remember evidence which supports their opinion, belief, or bias, and to dismiss or forget evidence which does not. It's a human frailty. Humans have to struggle to be rational.
6. "Biased Data". "A poll at a local pre-school playground in Boston at 2 pm today indicated that 87% of likely voters will vote for Obama." Picking your data sources, like picking the questions you ask, can determine your results with great accuracy. As pollsters always say, "Tell me the answer you want, and I will design the question."
7. "Data mining." Data-mining is used by unscrupulous academics who need to publish. Because it is a retroactive search for non-hypothesized correlations, it does not meet criteria for the scientific method. Let's say you have 10,000 data points from a study which found no correlation for your hypothesis. Negative correlation studies are rarely published, but you spend a lot of time collecting it - so you ask your computer if it can find any other positive correlations in the data. Then you publish those, as if that was what you had studied in the first place.
Image: two good varieties of cherries for picking; Stella on the left, Lapins on the right, from Miller Nurseries