An RCT is a "randomized controlled clinical trial."
We have discussed the scientific fallacy of "data mining" here in the past: instead of testing a hypothesis (aka the Scientific Method), the researcher simply asks the computer to find any correlations in the mountain of collected data. That is not science. It is typically done when a researcher has a mound of data which did not support his hypothesis. So as not to waste it, he asks the computer to find something else in it.
In any mountain of data, some correlations will turn up by chance alone - see the legal hoax of so-called Cancer Clusters.
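For readers who like to see it happen, here is a minimal sketch of the point above. It fills a table with pure random noise and then "mines" it for correlations. Everything in it is an assumption made for illustration: the 50 subjects, the 40 variables, the function names, and the cutoff of about 0.279, which is roughly the |r| needed for p < 0.05 with 50 subjects.

```python
import itertools
import math
import random


def pearson(x, y):
    """Plain Pearson correlation coefficient for two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def count_spurious(n_subjects=50, n_variables=40, r_cutoff=0.279, seed=0):
    """Count 'significant' correlations among columns of pure noise.

    r_cutoff is an assumed, approximate threshold for p < 0.05
    (two-tailed) at n_subjects = 50.
    """
    rng = random.Random(seed)
    # Every column is independent random noise: no real relationship exists.
    columns = [
        [rng.gauss(0, 1) for _ in range(n_subjects)]
        for _ in range(n_variables)
    ]
    pairs = list(itertools.combinations(range(n_variables), 2))
    hits = sum(
        1 for i, j in pairs
        if abs(pearson(columns[i], columns[j])) > r_cutoff
    )
    return hits, len(pairs)


hits, total = count_spurious()
print(f"{hits} of {total} variable pairs look 'significant' in pure noise")
```

With 40 variables there are 780 pairs to compare, so at a 5% false-positive rate one would expect a few dozen "findings" even though nothing in the data is related to anything else. Any one of them could be written up as "Study says...".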
Often enough, when you read "Study says...", you are reading a report from data mining. Our readers know that a statistical correlation often - or usually - means nothing, but data-mining "information" is non-information. Generally speaking, newspaper reporters never passed Statistics 101. (I did, but found stats difficult to explain to innumerate juries who even get confused by basic algebra.)
Even medical professionals get taken in by this growing technique. It's most common when secondary studies use the database from participants in a randomized controlled trial to look for correlations, not to scientifically test a hypothesis, let alone one the original trial had been designed to fairly test. Carefully controlled clinical trials are concerned with causes and effective treatments. In contrast, multivariate analyses of large databases, with their statistical manipulations and regression computer modeling, are statistics. Statistics is about correlations. It's not biological research.
Well, I have a lot to say about data mining, but too much for a comment. Maybe I'll post about it. However, it's simplistic, and inaccurate, to call data mining a scientific or logical fallacy. It's a tool, one which, yes, can be misused, but the fact that it can be misused does not invalidate it as a powerful tool.
Anyway, too much for a comment. Maybe a post, later.
I think the logical fallacy is of course mistaking correlation for causation. However, if you use the data mining to create new hypotheses, it can be very useful.
I have used it in looking for process effects in printing computer circuit test patterns, with some success, but it requires designing new experiments to test your new hypothesis. This is fairly simple compared to biological systems, however, so I understand your caveat.
H-m-m-m . . . sounds like Paul Ehrlich using the original "Gaia" dataset to suggest that the Earth was cooling (late 1950s, early 1960s, methinks), but later others "mining" the same data to prove global warmening.
What I thought of first, B, when I read the Blue Box, was the epidemiologists, who (very roughly, here) seek evidence of correlation in historical data. AFAIK, the only causation that can be laid at their feet is the link between smoking and lung cancer; a good one, to be sure. Cured me of epidemiology.
I call those "a hypothesis looking for a study". Data mining like that is a great way to find a nice hypothesis to test. But it disgusts me that people publish them as if they actually did a real study. But then, I was so disgusted with science and academia that I quit it after my first year of grad school.
Yes! Not in itself conclusive of anything, but too often published as "proof" rather than "Interesting, let's study it."
Decades back I gave up on this sort of thing, not backed up by any further study, when an attempt was made to suppress Playboy because "statistics show 75% of convicted murderers" had looked at pr0no. To which my response was, and remains, "100% of all murderers have ingested di-hydrogen monoxide."
It's as simple as that. Data mining can be a very useful tool if correctly evaluated, sifted and the results properly applied. Unfortunately, properly applying the data leads to fudging if the expected results don't quite agree with the theory being tested.