In All Probability

As a teenager, I had the good fortune to meet Tom Körner. He had written a book, The Pleasures of Counting, that opens with a story about John Snow.

Snow was a 19th-century English physician who painstakingly collected and analyzed vast amounts of data to convincingly argue that cholera spreads through contaminated drinking water, and not, as was once widely believed, through foul air (the miasma theory).

This struck me as a powerful example of how mathematics, and in particular, statistics, can impact our lives. How many more would have succumbed, had the true cause remained hidden?

Yet today, probability and statistics seem much maligned. Statistics are worse than "damned lies"; they’re "pliable"; one can "prove anything by statistics except the truth"; they are the means to produce "unreliable facts from reliable figures".

How did this happen? I’m sure these famous quotes were composed mainly in jest, and perhaps referred to shady accounting more than actual calculation. But these days, even the mathematics itself seems suspect:

Statistics is indeed a troubled subject. It turns out some guy named R. A. Fisher is to blame. Fisher had a tragic combination of gifts and flaws that led to today’s erroneous orthodox statistics. (Despite an ever-growing mountain of evidence, Fisher steadfastly refused to believe smoking causes lung cancer. How good could his methods be?)

My undergrad introductory course on probability and statistics followed Fisher’s dogma. As a result, the methods it taught seemed more like black magic than mathematics. But I was convinced that the lecturer only seemed to be teaching superstitions because my understanding was too shallow, and I concluded I must have a poor intuition for the subject.

Years later, determined to conquer my weakness in this area, I went back to my textbook. And some other books. I discovered the shocking truth: my textbook is wrong. For once, a crazy conspiracy theory was true, and They really were corrupting us all with Their false mathematics.

Epilogue

I heard from Fred Ross that this link got posted to Hacker News. He had this to say:

The underlying theory that justifies most inference (Bayesian, minimax, etc.) is decision theory, which is a subset of the theory of games. Savage’s book on the foundations of statistics has a very nice discussion of why this should be. I learned it from Kiefer’s book, which is the only book I know of that starts there. Lehmann or Casella both get to it later in their books.

The justification for p-value is actually the Neyman-Pearson theory of hypothesis testing. The p-value is the critical value of alpha in that framework. I wrote a couple of expository articles for clinicians going through this if you’re interested.

Jaynes was a wonderful thinker, but be aware that a lot of the rational actor theory breaks down when you don’t have a single utility function. That is true of using classes of prior (see the material towards the end of Berger), or in sequential decision problems (look at prospect theory in psychology, where the overall strategy may have a single utility function, but local decisions along the way can’t be described with one). So the claims in the middle of the 20th century for naturalness of Bayesian reasoning haven’t held up well.

Since then, I’ve done a little more reading, and my views have hardened. I disagree with Ross. Any flaws in Jaynes' unfinished work are more than compensated for by David MacKay’s Information Theory, Inference, and Learning Algorithms.

As Jaynes writes, Cox’s theorem is the underlying justification for Bayesian reasoning, not decision theory. See Chapter 36 of MacKay for a one-liner explanation of what decision theory actually is. (Also, appealing to psychology is suspect because the popularity of frequentism implies humans are often irrational actors!)
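For concreteness, here is the content of Cox’s theorem in my own words (a paraphrase, not a quotation from Jaynes or MacKay): any scheme of plausible reasoning that represents degrees of belief by real numbers and stays consistent with ordinary logic must obey the sum and product rules of probability, and Bayes' theorem drops out immediately:

  P(A | C) + P(¬A | C) = 1                            (sum rule)
  P(A, B | C) = P(A | C) P(B | A, C)                  (product rule)

  ⇒  P(A | B, C) = P(A | C) P(B | A, C) / P(B | C)    (Bayes' theorem)

In other words, probability theory is forced on us as the unique consistent extension of logic to uncertain propositions; decision theory only enters later, when we must act on those probabilities.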

Chapter 37 in particular humorously demolishes frequentism/sampling theory. MacKay also proposes an ingenious compromise for when you must deal with its fanatics: "from a selection of statistical methods," sampling theorists pick "whichever has the 'best' long-run properties". Thus, to sneak Bayesian reasoning past them, simply state that you’re choosing the method with the 'best' long-run properties, while being careful to avoid the word "Bayesian". I propose we use the phrase "MacKay’s correction"; for example, "the chi-squared significance test with MacKay’s correction" might satisfy reviewers suffering from frequentism.
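To make the trick concrete, here is a minimal sketch in Python (my example, not MacKay’s): when estimating a binomial proportion p from a handful of coin flips, the Bayesian posterior mean under a uniform prior, (k+1)/(n+2), beats the orthodox maximum-likelihood estimate k/n on a purely frequentist scorecard, namely long-run mean squared error, over most of the range of p when n is small. One can therefore recommend it solely for its 'best' long-run properties, no B-word required.

  # Sketch: long-run mean squared error of two estimators of a binomial
  # proportion p, measured over many repeated samples of size n.
  import random

  def long_run_mse(estimate, p, n, trials=100_000):
      """Average squared error of estimate(k, n) over repeated samples."""
      total = 0.0
      for _ in range(trials):
          k = sum(random.random() < p for _ in range(n))  # simulate n flips
          total += (estimate(k, n) - p) ** 2
      return total / trials

  mle = lambda k, n: k / n                # sampling-theory estimate
  bayes = lambda k, n: (k + 1) / (n + 2)  # posterior mean, uniform prior

  n = 10
  for p in (0.1, 0.3, 0.5, 0.7, 0.9):
      print(f"p={p}: MLE {long_run_mse(mle, p, n):.4f}  "
            f"Bayes {long_run_mse(bayes, p, n):.4f}")

(A committed sampling theorist might object that (k+1)/(n+2) is biased; the point is only that 'best long-run properties' is a criterion loose enough to let the prior in through the back door.)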

MacKay’s favourite reading on this topic includes: Jaynes, 1983; Gull, 1988; Loredo, 1990; Berger, 1985; Jaynes, 2003. MacKay also mentions treatises on Bayesian statistics from the statistics community: Box and Tiao, 1973; O’Hagan, 1994.

Although it is unrelated to probability, I recommend another book by David MacKay: Sustainable Energy: without the hot air. Again, MacKay clearly explains how to navigate an area infested by influential charlatans who mislead us with complex, illogical arguments.

The phrase "in the middle of the 20th century" calls to mind World War II, when Allied codebreakers applied Bayesian reasoning to break Germany’s Enigma cipher. Their methods "haven’t held up well"? Really? Which side won?

Meanwhile, Fisher’s diehard eugenicist views on miscegenation seem to have mostly fallen out of fashion. This is appropriate, as sampling theory seems to be increasingly acknowledged as a 20th-century practice that hasn’t held up well: see John Ioannidis, Why Most Published Research Findings Are False.


Ben Lynn blynn@cs.stanford.edu 💡