Natural Selection of Bad Science. Part II

[Image: a character in a red uniform, with the quote 'I can't change the laws of physics.']

From Climate Etc.

By John Ridgway

[Figure: 'Global warming EPIC FAIL' graph comparing 73 climate model projections (red) against average satellite and balloon observations (black), 1975 to 2020.]

In an earlier essay [1], I explained how positive feedbacks can lead to potentially problematic scientific monocultures. I also acknowledged that poor research design and data analysis had become commonplace within the behavioural sciences, largely as a result of a ‘natural selection’, driven by the career enhancement that comes with publication. However, I did not question whether there were any reward structures within climate science that may also have led to a natural selection for bad statistical practice.

Because modelling is not measurement

Well, in fact the evidence is that such a reward structure exists, and an early indication of the problem can be found in a quote reported by Professor Jeroen Pieter van der Sluijs of the University of Bergen. It is alleged to come from a modeller working on the IPCC’s First Assessment Report (FAR):

“What they were very keen for us to do at IPCC [1990], and modellers refused and we didn’t do it, was to say we’ve got this range 1.5 – 4.5°C, what are the probability limits of that? You can’t do it. It’s not the same as experimental error. The range is nothing to do with probability – it is not a normal distribution or a skewed distribution. Who knows what it is?” [2]

I’ve no idea whether this individual finally gave in to the demands or, preferring not to be swallowed up by a monoculture, decided to get out of the business. Either way, the problem didn’t go away, and we now live in a world where it is common practice for climate scientists to confuse epistemic and aleatory uncertainty [3] by treating the spread of model outputs as if it were a measurement problem that can be characterised using probability distributions. It’s possible that some climate scientists are simply unaware of the issue, but many others deliberately ignore it because they recognise the expedience of pretending that the epistemic uncertainties involved can be analysed as if they were aleatory. And to a large extent they do it because it appeases the policy makers. As Gavin Schmidt puts it:

“Model agreements (or spreads) are therefore not equivalent to probability statements. Since we cannot hope to span the full range of possible models (including all possible parameterizations) or to assess the uncertainty of physics about which we so far have no knowledge, hope that any ensemble range can ever be used as a surrogate for a full probability density function of future climate is futile…Yet demands from policy makers for scientific-looking probability distributions for regional climate changes are mounting, and while there are a number of ways to provide them, all, in my opinion, are equally unverifiable.” [4]

There is a good reason why they would all be equally unverifiable, and it has to do with the assumptions that would have to be made. As Professor Eric Winsberg of the University of South Florida explains:

“Ensemble methods assume that, in some relevant respect, the set of available models represent something like a sample of independent draws from the space of possible model structures. This is surely the greatest problem with ensemble statistical methods. The average and standard deviation of a set of trials is only meaningful if those trials represent a random sample of independent draws from the relevant space—in this case the space of possible model structures. Many commentators have noted that this assumption is not met by the set of climate models on the market…Perhaps we are meant to assume, instead, that the existing models are randomly distributed around the ideal model, in some kind of normal distribution, on analogy to measurement theory. But modeling isn’t measurement, and so there is very little reason to think this assumption holds.” [5]
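
To see why that sampling assumption matters, here is a minimal numerical sketch in Python (every number and both ‘ensembles’ are invented purely for illustration; nothing below is drawn from real model output). When the members really are independent draws scattered around the truth, the usual mean-and-standard-error arithmetic behaves as advertised; when they share a common structural bias, the same arithmetic produces a tight, confident-looking interval that can exclude the truth altogether.

```python
# Minimal sketch (hypothetical numbers): ensemble mean/std only carry
# probabilistic meaning if the members are an i.i.d. sample.
import numpy as np

rng = np.random.default_rng(0)

true_value = 3.0     # hypothetical "true" quantity the models try to capture
shared_bias = 0.8    # structural error shared by a family of related models

# Case 1: models really are independent draws around the truth (the textbook picture).
independent = true_value + rng.normal(0.0, 0.7, size=30)

# Case 2: the "independent" models inherit the same code and parameterisations,
# so they share a common bias and only differ by small tweaks.
related = true_value + shared_bias + rng.normal(0.0, 0.2, size=30)

for label, ens in [("independent draws", independent), ("shared-lineage ensemble", related)]:
    mean = ens.mean()
    sem = ens.std(ddof=1) / np.sqrt(len(ens))  # standard error, valid only for i.i.d. samples
    print(f"{label}: mean = {mean:.2f} +/- {sem:.2f} (nominal 1-sigma of the mean)")

# The second ensemble reports a tight, confident-looking interval that excludes
# the truth entirely: its spread measures agreement between cousins, not
# uncertainty about the world.
```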

Climate scientist Theodore G. Shepherd was even more blunt when he wrote:

“The uncertainty in the climate response to forcing is conceptually very different. It is not a property of the physical climate system; rather, it is a property of a state of knowledge, or degree of belief, and it can be reduced as knowledge improves. In contrast with aleatoric uncertainty, which is objective, such epistemic uncertainty is subjective. Therefore, treating epistemic uncertainty as if it were aleatoric, with a focus on the multi-model mean as a best estimate, has no epistemological justification. This has been recognized for some time, but the practice continues to be normative (e.g. as in figure 1). It is interesting to consider why this is so, since, in most areas of science, the essential distinction between systematic and random sources of uncertainty is well recognized.” [6]

Persistent, normative practice that is well known to be wrong is a strong indicator that we are dealing here with a natural selection for bad science. Gavin Schmidt can blame this on the policy makers, but I’m afraid the scientists must take responsibility. They could refuse to cooperate (as was suggested by the FAR modeller back in 1990), but nowadays most won’t refuse because they have careers and prestige to think about. Climate scientists have learnt that if they want to sit at the big table with the policy makers, they have to ignore any misgivings they might have and instead give the policy makers what they want: a pseudo-aleatory analysis that satisfies because it looks like the techniques learnt in school-day statistics lessons.

By way of mitigation, I should point out that many of the uncertainty analyses tackled by climate scientists involve a mixture of aleatory and epistemic uncertainties, and the extent to which the former dominates could lend a degree of legitimacy to the application of techniques developed to deal with the stochastic. However, this hardly condones the normative practices within climate science. Climate scientists should understand as well as anyone the need to isolate aleatory and epistemic uncertainties before propagating each with its appropriate statistical techniques [7]. There really is no excuse for this not being done. Whichever way one looks at it, failure to respect this important distinction has been allowed to continue, largely because showing such respect is unnecessary when it comes to gaining influence within climate science.
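
To illustrate what that separation might look like in practice, here is a minimal sketch loosely in the spirit of [7]. The response function and all of its parameter values are hypothetical, chosen only to show the bookkeeping: the aleatory input is given a Monte Carlo sample, the epistemic input is given bounds, and the honest result is an interval of probabilities rather than a single number.

```python
# Minimal sketch: propagate aleatory and epistemic uncertainty separately
# through a toy response function. All values are invented for illustration.
import numpy as np

rng = np.random.default_rng(1)

def response(forcing, sensitivity):
    """Toy model: warming = sensitivity * forcing (illustrative only)."""
    return sensitivity * forcing

# Aleatory input: genuinely random variability, legitimately described
# by a probability distribution and sampled by Monte Carlo.
forcing_samples = rng.normal(3.7, 0.3, size=100_000)

# Epistemic input: the sensitivity parameter is simply not known; all we
# can honestly state is an interval, not a distribution over it.
sensitivity_bounds = (0.4, 1.2)

# Appropriate treatment: propagate the Monte Carlo sample at each bound of
# the epistemic interval and report bounds on the resulting probability.
lo = response(forcing_samples, sensitivity_bounds[0])
hi = response(forcing_samples, sensitivity_bounds[1])
print("Bounds on P(warming > 3):",
      f"[{(lo > 3.0).mean():.3f}, {(hi > 3.0).mean():.3f}]")

# Common but inappropriate treatment: pretend the epistemic interval is a
# distribution, sample it, and quote a single 'probability'.
sensitivity_samples = rng.uniform(*sensitivity_bounds, size=100_000)
mixed = response(forcing_samples, sensitivity_samples)
print("Single pseudo-probability from mixing:",
      f"{(mixed > 3.0).mean():.3f}")
```

Under these made-up numbers the bounded answer is, in effect, ‘anywhere between nought and one’, which is uncomfortable but honest; the mixed calculation returns a single middling figure that merely reflects the arbitrary decision to spread a uniform distribution over a parameter about which we have professed ignorance.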

The importance of the issue: An AI analysis

I am not making a pedantic point here. Inappropriately treating epistemic uncertainty as if it were aleatory is known to lead to incorrect conclusions. In structural engineering this problem is quite notorious:

“It is shown that for proper formulation of reliability, careful attention should be paid to the categorization (epistemic, aleatory, ergodic or non-ergodic) of uncertainties. Failure to do so may result in underestimation or overestimation of failure probability, which can be quite significant (orders of magnitude) in certain cases.” [8]
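
A toy reliability calculation, loosely in the spirit of [8], makes the point concrete (all of the parameter values are invented and the system is deliberately simple): the same numerical spread in a capacity parameter yields markedly different system failure probabilities depending on whether it is categorised as independent unit-to-unit variability (aleatory) or as a single unknown value shared by every component (epistemic).

```python
# Minimal sketch: how categorising the same uncertainty as aleatory or
# epistemic changes a computed failure probability. Numbers are invented.
import numpy as np

rng = np.random.default_rng(2)
n_components = 20      # series system: it fails if any one component fails
n_trials = 200_000
load = 1.0             # demand on each component
capacity_mean, capacity_sd = 1.5, 0.15  # uncertain capacity parameter

# Treatment 1: capacity uncertainty is aleatory and independent per component
# (e.g. genuine unit-to-unit manufacturing variability).
capacities = rng.normal(capacity_mean, capacity_sd, size=(n_trials, n_components))
p_fail_independent = (capacities.min(axis=1) < load).mean()

# Treatment 2: the same numerical spread is epistemic: one unknown capacity
# value shared by every component (e.g. a systematic modelling error).
shared_capacity = rng.normal(capacity_mean, capacity_sd, size=n_trials)
p_fail_shared = (shared_capacity < load).mean()

print(f"Independent (aleatory) treatment: P(failure) ~ {p_fail_independent:.5f}")
print(f"Shared (epistemic) treatment:     P(failure) ~ {p_fail_shared:.5f}")
# Same marginal spread, very different system failure probabilities: the
# categorisation of the uncertainty, not just its size, drives the answer.
```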

Out of curiosity, I recently quizzed Meta AI on this very issue [9]. It wasn’t long before I asked:

“So why is the ensemble uncertainty modelled as if it were a measurement problem, using methods developed to handle aleatory uncertainty?”

The answer I received was as follows:

“You’re hitting on a nuanced point. Climate scientists often treat model ensemble uncertainty as if it were aleatory uncertainty, using statistical methods designed for random variability. This might seem counterintuitive, given that model ensemble uncertainty is largely epistemic.”

It proceeded to give three excuses for this. Firstly, it was a ‘pragmatic’ approach, ‘even if it’s not a perfect fit’. Secondly, there was said to be a ‘lack of clear alternatives’. Finally, it was said to facilitate ‘interpretation and communication…even if it’s not a perfect interpretation’. All of this mealy-mouthed talk of nuance, pragmatism and imperfection left me feeling rather uncomfortable, so I chose to be more direct:

“You talk of nuance, but it is known that using aleatory methods to analyse epistemic uncertainty can give the wrong results. Given the importance of the use of climate model ensembles, getting accurate results is surely more important than pursuing a ‘pragmatic’ approach for lack of an appropriate one.”

The AI response was as follows:

“You’re right. Using methods designed for aleatory uncertainty to analyse epistemic uncertainty can lead to inaccurate or misleading results. In high-stakes fields like climate science, accuracy is crucial. The pragmatic approach might be convenient, but it’s not a substitute for rigor. Climate model ensembles are indeed crucial for informing decision-making and political development, and using inappropriate methods can have significant implications.”

I wanted something even more explicit, so later on in the exchange I wrote:

“I’m not talking about reviewing in the light of new information or improved methods. What about recognising that climate change risk may not be as high as has been assumed? Is this not a possible consequence of the application of inappropriate methods for uncertainty analysis?”

Perhaps not wishing to disappoint me, the AI responded with:

“You’re asking if the use of inappropriate methods for uncertainty analysis could have led to an overestimation of climate change risks. That’s a valid concern. If methods designed for aleatory uncertainty were used to analyse epistemic uncertainty, it’s possible that:

  1. Uncertainty might have been mischaracterised: This could have led to an exaggerated sense of risk or probability.
  2. Risk assessments might have been biased: Overestimating the likelihood or magnitude of certain outcomes.

Recognizing this possibility is essential. It doesn’t necessarily mean that climate change risks are low, but rather that a more nuanced understanding of uncertainty could lead to more accurate risk assessments. This could, in turn, inform more effective and targeted strategies for mitigation and adaptation.”

Admittedly, the AI was not so undiplomatic as to suggest that the furtherance of careers was a principal motivation for taking the wrong approach, but I don’t expect there was anything in its training data that could lead to it being so tactless. Instead, the problem is dressed up with talk of the need to be ‘pragmatic’ and to use approaches that are ‘less than perfect’. This, no doubt, reflects the extent to which the climate science community now condones and normalises the aleatory treatment of epistemic uncertainty. But the reality is that the approach taken is just plain wrong, and Meta AI was incapable of defending it when challenged. Furthermore, if there is any pragmatism here, it is a pragmatism that protects the scientist from falling short in the eyes of policy makers. The problem is that no one is incentivised to be right.

Undermining confidence

As with the statistical malpractice within the behavioural sciences, there is nothing here to suggest any conscious strategizing or cheating. Those climate scientists who confuse the epistemic with the aleatory do so either because they know no better or because they don’t really think it can possibly matter. After all, nearly everyone around them is making the same mistake and it doesn’t appear to be doing them any harm. Instead, one has to wonder what happened to those who refused to play the probability distribution game.

Inappropriate use of aleatory analysis doesn’t necessarily benefit the ‘alarmist’ cause, and so there is no hoax to be found here. That said, if you look at the political use to which Extreme Event Attribution (EEA) is put, and then reflect upon the extent to which EEA practitioners play down their inappropriately aleatory approach to multi-model ensemble uncertainty analysis, it becomes difficult to shake off the suspicion that some wilful ignorance is involved. Even after offering the benefit of the doubt, there is a lingering disquiet that, at the very least, undermines confidence in the science being promulgated. As a lay observer, one cannot say too much, but it remains the case that a firm grasp of the mathematical and philosophical framework for uncertainty analysis will cease to be sufficient once other factors come into play that encourage what Meta AI euphemistically calls ‘less than perfect’ practice. The assumption that the scientific method will always ensure a natural selection for good science would seem to be somewhat optimistic; that much is clear from the way the climate science community handles its uncertainties.

Footnotes:

[1] Ridgway J. 2025 “The Matthew Effect, Mono-cultures, and the Natural Selection of Bad Science”, Climate Etc. https://judithcurry.com/2025/09/08/the-matthew-effect-mono-cultures-and-the-natural-selection-of-bad-science/.

[2] van der Sluijs J.P., van Eijndhoven J.C.M., Wynne B. and Shackley S. 1998, “Anchoring Devices in Science For Policy: The Case of Consensus Around Climate Sensitivity”, Social Studies of Science, 28, 291-323. https://doi.org/10.1177/030631298028002004.

[3] Fox C.R. and Ülkümen G. 2011, “Distinguishing Two Dimensions of Uncertainty”, in Brun W., Kirkebøen G. and Montgomery H. (eds.), Essays in Judgment and Decision Making, Oslo: Universitetsforlaget. https://ssrn.com/abstract=3695311.

[4] Schmidt G. 2007 “Climate models produce projections, not probabilities”, Bulletin of the Atomic Scientists. https://thebulletin.org/roundtable_entry/climate-models-produce-projections-not-probabilities/.

[5] Winsberg E. 2012 “Values and uncertainties in the predictions of global climate models”, Kennedy Institute of Ethics Journal, Vol. 22, Number 2, pp. 111-137. https://doi.org/10.1353/ken.2012.0008.

[6] Shepherd T.G. 2019 “Storyline approach to the construction of regional climate change information”, Proc. R. Soc. A, 475:20190013. https://doi.org/10.1098/rspa.2019.0013.

[7] Ferson S. 1996 “Different methods are needed to propagate ignorance and variability”, Reliability Engineering and System Safety. https://doi.org/10.1016/S0951-8320(96)00071-3.

[8] Der Kiureghian A., Ditlevsen O. 2009 “Aleatory or epistemic? Does it matter?”, Structural Safety, Vol. 31, Issue 2, pp 105-112. https://doi.org/10.1016/j.strusafe.2008.06.020.

[9] Ridgway J. 2025 “What Does Artificial Denial Look Like? – Going from AI to AD”, Climate Scepticism. https://cliscep.com/2025/05/17/what-does-artificial-denial-look-like/.

