The Soul of Modeling, Probability & Statistics
Being something of a beginner in the art of statistical analysis, I thought this book on the philosophical and conceptual underpinnings of statistical methods would be instructive, and I was right. I learned so much I’m not sure I want to learn any more.
In a nutshell: Briggs is critical of most of the standard apparatus of statistical methods, both technical and interpretive. Hypothesis testing, regression, data smoothing, quantification of everything, and, above all, p-values he condemns to perdition. The problem is not that such methods have no value, but that they are widely misunderstood and misapplied, with the result that the conclusions drawn from statistical analyses are often simply wrong, or else the uncertainty in those conclusions is underestimated (and by an unknown amount). He gives many examples of ways in which standard techniques lead to spurious “significant” results.
Because he criticizes standard statistical methods, one might get the impression that Briggs is a lone voice crying in the wilderness, but he offers plenty of citations for most of his arguments. He belongs to an alternative, minority, but not negligible tradition.
Some of the important points he makes:
Probability is logical. Logic concerns relationships between propositions, and so does probability, except that in the latter case the logic is extended to propositions the truth of which is uncertain. This point was made lucidly and rather beautifully by Jaynes, and reading Briggs has made me want to return to that book to read more of it.
Probability is not a cause. Probability can tell us about correlations, but nothing at all about causes. The habit of inferring causes from statistical correlations, absent a corresponding causal model, is a bad habit that leads many astray. In general, uncertainty reflects our ignorance of causes rather than our knowledge of them.
Probability is conditional. Probability statements are always conditional on a set of premises. There is no such thing as Pr(X), but only Pr(X|Y), that is, the probability of X given some set of premises Y. If the premises change, the probability of X will, in general, change. Thus Briggs, while not quite a Bayesian, does think the Bayesians have it over the frequentists when it comes to the debate over whether probability is objective (i.e., out there) or subjective (i.e., in the mind). Probabilities reflect the uncertainty in propositions given what we know; they do not exist outside our minds, and they change when our knowledge changes. A corollary is that one should never say, “X has a probability of Z”. Nothing has a probability. Probability does not exist. One should only say, “Given premises Y, the probability of X is Z.”
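To make the point about premises concrete, here is a minimal sketch of my own (not an example from the book). It assumes two fair six-sided dice, so that all 36 outcomes are equally likely, and the helper name `pr` is hypothetical. The same proposition, "the total is eight", gets a different probability under each set of premises.

```python
from fractions import Fraction
from itertools import product

# Assumed premise: two fair six-sided dice, all 36 outcomes equally likely.
OUTCOMES = list(product(range(1, 7), repeat=2))


def pr(event, given=lambda outcome: True):
    """Return Pr(event | given) by counting outcomes consistent with the premises."""
    allowed = [o for o in OUTCOMES if given(o)]
    favorable = [o for o in allowed if event(o)]
    return Fraction(len(favorable), len(allowed))


def total_is_eight(outcome):
    return outcome[0] + outcome[1] == 8


# Same proposition, three different sets of premises.
p_dice_only = pr(total_is_eight)
p_first_is_six = pr(total_is_eight, given=lambda o: o[0] == 6)
p_first_is_one = pr(total_is_eight, given=lambda o: o[0] == 1)

print(p_dice_only, p_first_is_six, p_first_is_one)  # 5/36 1/6 0
```

Even the first number is conditional: it depends on the premise, built into `OUTCOMES`, that the dice are fair.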
Probability is often not quantifiable. If we know “Most people like ice cream and Sue is a person”, the probability that Sue likes ice cream cannot be naturally or unambiguously quantified unless the meaning of “most” is clarified. Moreover, it is often a mistake to force probabilistic arguments into a quantified form. Briggs argues that the habit of doing so (as with “instruments” for assessing subjective attitudes about politics or emotional responses to stimuli, for instance) often leads to misleading results and promotes the vice of scientism.
Statistical significance is not objective. No probability model can tell one whether a given probability is significant or not. This is an extra-statistical, and often an extra-scientific, question. Whether it is judged significant is a matter of prudential judgment based on the specific question at issue and the decisions to be made about it. Thus he would like to disrupt the “turn the crank” model of statistical analysis in which “significant” results pop out of the sausage-maker, returning such questions to spheres of deliberation and judgment.
Probability models should be predictive. Briggs’ principal constructive suggestion (apart from shoring up our understanding of what probability is) is that statistical models should be predictive. They should state their premises in as much detail as possible, and should predict observations on the basis of those premises (taking into account uncertainties, of course). If the models fail to predict the observables, they are not working and should be amended or scrapped. As I understand it, he is proposing that fields which lean heavily on statistics should, by following his proposals, become more like the hard sciences. True, progress will be slower, and (acknowledged) uncertainties larger, but progress will be surer and causes better understood.
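One common way to check whether a probability model predicts observables is to score its forecasts against what actually happened. The sketch below uses the Brier score (mean squared difference between forecast probability and outcome). That choice is my assumption, not necessarily the procedure Briggs recommends, and the forecasts and outcomes are invented purely for illustration.

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes (lower is better)."""
    if len(forecasts) != len(outcomes):
        raise ValueError("forecasts and outcomes must have the same length")
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)


# Hypothetical data: what happened (1 = event occurred) and what a model predicted beforehand.
outcomes = [1, 0, 0, 1, 1, 0, 1, 0]
model_forecasts = [0.9, 0.2, 0.1, 0.7, 0.8, 0.3, 0.6, 0.2]

# A know-nothing baseline that always says 50%.
baseline_forecasts = [0.5] * len(outcomes)

model_score = brier_score(model_forecasts, outcomes)
baseline_score = brier_score(baseline_forecasts, outcomes)

print(f"model: {model_score:.3f}, baseline: {baseline_score:.3f}")
```

A model that cannot beat the know-nothing baseline on observations it has not seen is, in the spirit of the paragraph above, a candidate for amendment or the scrap heap.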
Briggs has some fun pointing out common fallacies in statistical circles. There is, for instance, the We-Have-To-Do-Something Fallacy, in which a perceived imperative to do something about something (usually something political) leads to the employment of some defective or fallacious statistical method, the defectiveness or fallaciousness of which is then ignored. Or the Epidemiologist’s Fallacy, in which a statistician claims “X causes Y” even though X was never measured and though statistical models cannot in any case discern causes. (This fallacy is so-called because without it “most epidemiologists, especially those in government, would be out of a job”.) Or the False Dichotomy Fallacy, which is the foundational principle of hypothesis testing. Or the Deadly Sin of Reification, whereby statisticians mistake parameters in their statistical models for real things. And so on.
Much of this might seem rather obvious to the uninitiated. I’m not an adept of the standard techniques, so I was at times a little puzzled as I tried to discern the particular bad habit Briggs was criticizing. But, as is increasingly appreciated (here and here, for instance), the use and abuse of the standard techniques have led wide swathes of the scientific community into error, most commonly the error of over-certainty, which is actually an uncertainty about what is true. An audience for this book clearly exists.
Were his recommendations to be followed, he argues that the effects would be
a return to a saner and less hyperbolic practice of science, one that is not quite so dictatorial and inflexible, one that is calmer and in less of a hurry, one that is far less sure of itself, one that has a proper appreciation of how much it doesn’t know.
But, on the other hand, it would reduce the rate at which papers could be published, would make decisions about significance matters of prudential judgment rather than scientific diktat, and would make scientific conclusions more uncertain. He is fighting an uphill battle.
Briggs is an adjunct professor at Columbia, and has done most of his scientific work in climate science (and is, as you would expect, skeptical of the predictions of statistical climate models, which provide a few of his case studies). He seems to be something of an atypical academic: this book, for instance, includes approving references to Aristotle, Thomas Aquinas, and even John Henry Newman (whose Grammar of Assent he cites as an example of non-quantitative probabilistic argumentation). It’s quite a rollicking read too. Briggs has a personality, and doesn’t try to hide it. Personally I found the tone of the book a little too breezy, the text sometimes reading almost as if it were transcribed lecture notes (I make no hypothesis), but overall the book is smart and clear-eyed, and I’m glad to have read it. Now back to Jaynes.
I found a good video that illustrates the problem with relying on p-values to determine statistical significance. When I consider that many of the findings of the social sciences are based on this criterion, I’m not sure whether to cringe or weep. No wonder there is a replication crisis. Witness the dance of the p-values.
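The dance is easy to imitate with a small simulation. This is a rough sketch under my own assumptions, not a reconstruction of the video: the same experiment (a real effect of 0.5 standard deviations, 30 observations, known unit variance, two-sided z-test) is repeated twenty times, and the only thing that changes between runs is the random sample.

```python
import math
import random
import statistics


def two_sided_p(sample, sigma=1.0):
    """Two-sided z-test p-value for H0: mean = 0, assuming a known sigma."""
    z = statistics.fmean(sample) / (sigma / math.sqrt(len(sample)))
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


def dance(true_effect=0.5, n=30, replications=20, seed=1):
    """Rerun the identical experiment many times and collect the p-values."""
    rng = random.Random(seed)
    return [
        two_sided_p([rng.gauss(true_effect, 1.0) for _ in range(n)])
        for _ in range(replications)
    ]


p_values = dance()
for p in p_values:
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"p = {p:.4f}  ({verdict})")
```

Nothing about the underlying effect differs from one replication to the next, yet the p-values wander, and a single one of them is what usually gets published.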
Here is a short video illustrating why it is reasonable to doubt the putative findings of many (and perhaps most) published research papers employing statistical methods. This argument and others are set forth in detail by Ioannidis.
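The core of the argument can be put in a few lines of arithmetic. The sketch below uses the basic positive-predictive-value formula associated with Ioannidis, PPV = (1 − β)R / (R − βR + α), where R is the pre-study odds that a tested relationship is real, α is the significance threshold, and 1 − β is the power. It ignores bias and multiple competing teams, the input numbers are illustrative only, and whether the video uses exactly this form is my assumption.

```python
def ppv(prior_odds, alpha=0.05, power=0.8):
    """Probability that a 'significant' finding is true, ignoring bias.

    prior_odds: R, the pre-study odds of true to false relationships.
    alpha: the false-positive rate of the test.
    power: 1 - beta, the chance of detecting a true relationship.
    """
    true_positives = power * prior_odds
    false_positives = alpha  # per unit of false relationships tested
    return true_positives / (true_positives + false_positives)


# Illustrative scenarios (the numbers are assumptions, not data).
well_powered = ppv(prior_odds=0.1, power=0.8)   # 1 in 11 hypotheses true
under_powered = ppv(prior_odds=0.1, power=0.2)  # same field, small studies

print(f"well powered:  {well_powered:.2f}")
print(f"under powered: {under_powered:.2f}")
```

When few of the hypotheses being tested are true and studies are small, most findings that clear the p < 0.05 bar are false, even before bias enters the picture.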