We are a commune of inquiring, skeptical, politically centrist, capitalist, anglophile, traditionalist New England Yankee humans, humanoids, and animals with many interests beyond and above politics. Each of us has had a high-school education (or GED), but all had ADD so didn't pay attention very well, especially the dogs. Each one of us does "try my best to be just like I am," and none of us enjoys working for others, including for Maggie, from whom we receive neither a nickel nor a dime. Freedom from nags, cranks, government, do-gooders, control-freaks and idiots is all that we ask for.
Tuesday, April 9. 2013
It's been far too long since I studied or used statistics, other than to read medical journal articles. Everybody talks about Bayesian statistics nowadays. It is the new old thing, predating Fisher's frequentist statistics by more than 150 years (Fisher was an interesting fellow).
In my youth, I learned to be always skeptical about any research results, but I am told that running data through Bayesian methods is a good test of data.
Can somebody explain the concept to me in simple English? I don't intend to use it, just to get the ideas. (I can do the math, but I want something conceptual for starters.) Most liberal arts students learned basic stats in college - p-values, the t-test, etc. - but the Bayesian approach is new to me.
I think two stats sum up the entire field:
1. Half of humanity is below average intelligence.
When you're wondering why it's so hard to get two people to agree on anything, there's your reason. This is especially true if, by sheer bad luck, both are 'belows'. How would you expect two people of below average intelligence to know much of anything? If they did, they'd be above.
But here's the real kicker:
2. It's estimated that 1 out of 4 people is clinically insane. If the three people closest to you are okay, it's you.
My advice is to purposefully hang out with the 'belows', just to be on the safe side.
Well, I won't demonstrate my ignorance here.
However, a few months ago I picked up 'Thinking Statistically' by Uri Bram from Amazon. A fun little book (35 pages) that runs you through the concepts.
Flip a fair coin 19 times and it comes up heads all 19 times, make a bet on the next flip.
Classical statistics: Bet on tails - surely a run like that can't keep happening.
Bayesian statistics: Bet on heads, there is no way that coin is fair.
Classical statistics looks at the raw probabilities and produces predictions based on those. I.e., the chances of a fair coin coming up heads that many times are incredibly low.
Bayesian statistics goes further and adds the known history to guide predictions of future states. I.e., the chances of a fair coin coming up heads that many times are low, but the history flips the probability around to say that the chances of tails coming up are what's incredibly low.
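A minimal Python sketch of the two bets. The uniform Beta(1,1) prior on the Bayesian side is my assumption for illustration, not something the commenter specified:

```python
# The 19-heads coin run, read two ways.

def fair_coin_run_probability(n_heads: int) -> float:
    """Frequentist reading: chance a truly fair coin gives n_heads in a row."""
    return 0.5 ** n_heads

def bayesian_next_head(heads: int, flips: int) -> float:
    """Bayesian reading: posterior predictive P(next flip is heads) after
    `heads` heads in `flips` flips, starting from a uniform Beta(1,1) prior.
    This is Laplace's rule of succession: (heads + 1) / (flips + 2)."""
    return (heads + 1) / (flips + 2)

print(fair_coin_run_probability(19))  # ~1.9e-6: "no way that coin is fair"
print(bayesian_next_head(19, 19))     # ~0.952: bet on heads
```

The first number is why the run itself looks absurd for a fair coin; the second is why the Bayesian, having updated on that run, bets heads.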
Some of the oldest articles at overcomingbias.com are great intros to the subject.
At the start of the run, neither method has any advantage, as they would both assume a fair coin with 50/50 chances.
The divergence comes as the run progresses. The Bayesian method naturally converges on the correct probabilities where the classical method stays constant until you step in to investigate and correct it.
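A rough sketch of that convergence, with a made-up bias of 0.9 and a Beta(1,1) starting prior (both are my assumptions for illustration):

```python
# How a Beta(1,1) Bayesian estimate converges on a biased coin.
import random

random.seed(42)
true_p = 0.9          # the coin is actually heavily biased toward heads
heads, flips = 0, 0
for _ in range(1000):
    heads += random.random() < true_p
    flips += 1

# Mean of the posterior Beta(1 + heads, 1 + tails); drifts toward ~0.9.
posterior_mean = (heads + 1) / (flips + 2)
print(posterior_mean)

# The classical point prediction for an assumed-fair coin stays at 0.5
# throughout, until someone steps in to re-examine the model.
```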
Give this a shot - it is better than anything I could think up. So go read this description; as I said, pretty good! (I'll check to ensure the link comes through... sometimes...)
But if you set up the test as: "The sun went down. Will it rise again in the next 15 minutes? No! Black marble. How about the next 15 minutes? Oops, another black marble..." Now you have a bag full of black marbles and only one white marble. Your data is pretty screwed up.
That may sound ludicrous on the surface, but sometimes it is difficult to find the proper parameters of past experience. So, as with all statistics, it is wise to know what you are analyzing. Again, simple on the surface.
Most research is bunk.
You must have heard this one:
Researcher to statistician: "Here's some data! Can you do some analysis and give me a number?"
Statistician to researcher: "What number did you have in mind?"
Sorry, but both of your examples are examples of Bayesian statistics. In Bayesian statistics, but not frequentist statistics (i.e., what you call classical stats), knowledge of PRIOR trials (prior flips of the coin) is taken into account. In frequentist statistics, much of which was laid out by the eminent statistician Sir Ronald Fisher, PRIORs don't matter, and each trial (flip of the coin) is independent of every trial that went before and every trial that comes after. Thus, in the frequentist approach, no matter how many heads preceded the 20th flip, the chance of a head or tail on the 20th trial is still 50/50 (for an unbiased or fair coin). By the chain rule, Pr(getting 20 heads in a row) = Pr(getting 19 heads in a row) X Pr(getting a 20th head GIVEN that it's preceded by 19 heads in a row). In the Bayesian approach, that conditional probability is not fixed at 50/50: it is computed from a prior belief about the coin's fairness, updated by the evidence of the 19 straight heads.
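A small sketch of the contrast on the 20th flip. The Bayesian number again assumes a uniform Beta(1,1) prior over the coin's bias, which is my choice, not the commenter's:

```python
# Two treatments of flip 20 after 19 straight heads.

# Frequentist: flips of an assumed-fair coin are independent,
# so the conditional probability is still 0.5 regardless of history.
p_20th_frequentist = 0.5

# Bayesian: updating a uniform Beta(1,1) prior on 19 heads gives a
# posterior Beta(20, 1), whose predictive chance of another head is 20/21.
p_20th_bayesian = (19 + 1) / (19 + 2)

print(p_20th_frequentist)         # 0.5
print(round(p_20th_bayesian, 3))  # 0.952
```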
That example doesn't go far enough.
After 24 hours, the Bayesian method would have corrected itself to 50-50. I.e., the chance of the sun being up or down for any 15-minute period in any 24-hour period is 50%.
As time went on and more and more data points were collected, it would vary from 50% less and less as it converged on the correct probability.
How would that prediction be formulated in classical statistics?
I get it. In classical statistics, the likelihood of a precinct in Philadelphia voting 100% for Obama is extremely low. In Bayesian statistics, the likelihood of a 100% vote for Obama is very good.
One way to think about what Bayesian statistics brings to the table is the weak syllogism:
All men are mortal
Socrates is mortal
Hence it is more likely that Socrates is a man
The Bayesians would claim that this is closer to how people actually reason. In Bayesian statistics, probability is interpreted not as a frequency of occurrence (the frequentist view), but rather as a measure of one's knowledge of the situation. As such, there is a subjective component to it that reflects the individual's knowledge and presuppositions (priors). A measurement, in consequence, is not just one event in a long sequence of events whose frequency one seeks to measure, but rather an input used to update one's knowledge via Bayes' rule, resulting in improved knowledge (the posterior). In practice there are a number of measurements, and the posterior becomes the new prior to be updated on the next measurement.
The method is well suited to tracking things in real time and also lends itself to complicated models that are, in effect, evolved step by step. A tricky part is picking the starting prior when symmetry (think dice) doesn't come into play. It can also be computationally expensive if the model is at all complicated.
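The posterior-becomes-the-new-prior loop described above can be sketched with a conjugate Beta-Bernoulli model, which keeps every update closed-form (the model choice and the sample stream are my assumptions for illustration):

```python
# Each measurement turns the current prior into a posterior,
# which becomes the prior for the next measurement.

def update(a: float, b: float, observation: int) -> tuple:
    """Conjugate Beta(a, b)-Bernoulli update: a success bumps a, a failure bumps b."""
    return (a + 1, b) if observation else (a, b + 1)

a, b = 1.0, 1.0                       # starting prior: Beta(1,1), i.e. "no idea"
for obs in [1, 1, 0, 1, 1, 1, 0, 1]:  # a short stream of measurements
    a, b = update(a, b, obs)          # posterior immediately becomes new prior

print(a / (a + b))  # posterior mean after the stream: 7/10 = 0.7
```

Picking Beta(1,1) here is exactly the "starting prior" problem the comment mentions; with no symmetry argument available, the uniform prior is just one defensible convention.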
Ahem - you're all wrong - sadly and unfortunately.
Any statistical sample or sampling technique is inherently flawed: bias by the observer is built into any result, making the whole exercise unstable and useless. Outliers on the curve will always prove to be the exception to the rule, and the rule is thus always flawed.
You're welcome - I live to serve.
In simple English:
Bayesian statistics are based on probabilities that events may happen.
You can choose sets of information that are mutually exclusive and exhaustive and the probabilities of these events would sum to 1.
Such a set can be expressed as P(E)+P(not_E) = 1.
An example where this model is more useful than in random trials (which are well described by Bayes' contemporary, Bernoulli) is "what is the probability that the kids will play football this afternoon?"
This question has numerous dependencies that influence the outcome, that can likewise be assigned probabilities of occurrence. These dependencies are formulated as
"what is the probability the kids will play football GIVEN THAT it is raining (or sunny)?"
There are other dependencies that we can easily imagine, such as "what is the probability the kids will play football GIVEN THAT it is raining (or sunny), GIVEN THAT they have finished their homework?"
It is quite a practised art to map out these relations and cross dependencies for event monitoring reliably. If done well, it can provide an astonishing insight into how events and outcomes can be monitored. Most importantly, it always gives you a component of unknowns or new events that could happen: large maps can be drawn down to the very finest resolution in some major problems. In my opinion, it is this structured cartographic layout of possible events that gives Bayesian techniques their value in problem assessment.
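The football example above can be sketched numerically with the law of total probability. All the numbers below are made up purely for illustration:

```python
# The kids-playing-football example: assemble P(play) from
# conditionals via the law of total probability.

p_rain = 0.3               # assumed chance of rain this afternoon
p_play_given_rain = 0.1    # assumed: they rarely play in the rain
p_play_given_sun = 0.8     # assumed: they usually play when it's sunny

p_play = p_play_given_rain * p_rain + p_play_given_sun * (1 - p_rain)
print(round(p_play, 2))  # 0.59
```

Adding the homework dependency would split each branch again (rain/sun crossed with homework done/not done), which is exactly the cartographic map-of-events layout the comment describes.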
As for seeding the actual probabilities between the dependencies, as in all known branches of statistics, these can be broken down into three categories:
Past data (empirical).
A priori (justifiable logic).
A mixture of the two.
In practice, the third needs to be used. The first is useful for steady data, but even weather data is highly variable, and so you are likely to need to mix in the second. In publications, people will always claim to have used the a priori. Other people in practice use the third method and get satisfactory results while lacking a satisfactory "scientific" explanation.
Hope this helps!
All the best,
Pretty close, but Bayesian statistics would not predict 100% one-party turnout until it had seen the results of several votes. You have to remember it starts with the exact same predictions (priors) as classical statistics. Its strength is that it is able to start correcting itself fairly quickly after seeing real samples.
Re: statistical sampling
That is exactly why Bayesian statistics has strength. Of course the observer's bias is built into the priors - that is a given in Bayesian statistics. It was specifically designed to self-correct given enough samples. The difficulty comes in making sure it is given a typical sample set since that is also unknown.
Outliers being an exception is a tautology regardless of the statistical method being used. They do not invalidate a method; they highlight its strengths and weaknesses. With some data sets, classical statistics handles the outliers better. With most data sets, Bayesian statistics handles the outliers better.
Baloney - any statistical system is only as good as its samples. Samples are biased toward whatever objective is being sought. It's an inherent flaw in any statistical system. There is no "tautology" in the statement that outliers prove the systems are worthless, because that is exactly what outliers prove - they cannot be accounted for by any system.
Statistics also prove to be only a snapshot in time, and an out-of-focus one at that, simply due to the observer effect. There isn't any system, or bit of a system, that allows a statistical sample to hold true over time.
Want proof? One word - Sabermetrics: 29 separate statistical measurements of the worth of any given baseball player which, as a general rule, fail to accurately predict any given player's performance at any given time.
Bayesian or classical, seeking a mean or median to predict a future outcome, especially concerning life, is problematic. Read Stephen Jay Gould's essay, "The Median Isn't the Message."
Many years ago I worked in a highly technical field before it was a common household word. My very good friend, who also worked in the field with me, had trouble reading and to my knowledge never wrote anything. When the system went down, he was the go-to guy. He would get calls in the middle of the night from the frantic shift supervisor to come in and fix this critical system. I doubt he ever read a book; he came across as a hick, and no one would have considered him in the above-average half.
Another co-worker was a genius and actually worked in the budding field for a year or so before entering this highly specialized segment of the technical industry. He was, or should have been, an overachiever. He was in fact useless and was unable to put it together to fix problems within the two-minute time limit before the system was considered disabled/down. To my knowledge he never fixed a problem and simply took up space. There was no doubt he was a genius; anyone would realize that within minutes of talking with him, just as there was no doubt my friend was functionally illiterate.
So what explains that? It has been my experience in this field that the most promising and clearly intelligent were abject failures, and often it was the more unlikely people who were amazing. Go figure.
Go to Dr. William Briggs' website and download his free textbook on statistics. It's all about Bayes.
Agreed. Matt Briggs is well versed in helping doctors with Bayes and other stats questions. You can email him directly too.
At a deeper level, classical statistics is a Platonist philosophy. There is a real underlying distribution and truth that only awaits our calculation.
They will say "This is the distribution!"
Bayesian statistics says, we'll never know what the "truth" is, we can only see what opinions we can generate and verify. That's why Bayesians start with a guess, check it against the data, try another guess, and so on, until the costs of further guessing are no longer worth the effort on the available data.
They will say, "This is the best we can do!"
BTW, Gaussian distributions are what people usually think of when it comes to statistics, but there is another one - Poisson. It's used when you have only a few (