Models Behaving Badly

  • Date: 23/04/14
  • Dr David Whitehouse

Science has many useful models including those of the atom, the Solar System, the dynamics of the Universe, the drifting of the continents, the movement of water across a landscape, the flow of traffic, the spread of disease, the behaviour of solids, liquids, gases and plasmas, the fundamental structure of matter, and of course, the Earth’s climate.

Models go back a long time. Ptolemy’s epicycles and eccentrics were a model to explain the motions of the planets in the sky. Among the first to discuss the philosophical nature of models was Descartes, who rightly saw them as mental constructs. He said that models were useful because they are impersonal and because one can deduce from them consequences that subsequently agree with observations.

Models come and go as observations and concepts develop. The Ptolemaic model still works up to a point, but it has been replaced by Kepler’s laws of planetary motion, which work much better and from which the law of gravity was deduced by Newton. Indeed Newton’s famous equation, relating gravitational force to the product of the masses, the inverse square of the distance and a gravitational constant, is in itself a model. It’s a very accurate one in most situations, but ultimately Einstein’s General Theory of Relativity relegates it to a weak-field approximation, as it introduces its own model of masses distorting spacetime.
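For reference, the relation described in words above is usually written as follows, where F is the magnitude of the attractive force, m_1 and m_2 are the two masses, r is the distance between their centres and G is the gravitational constant. The symbol names are the conventional ones and are not taken from this article.

```latex
F = G \, \frac{m_1 m_2}{r^2}
```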

A model, then, is a simplified mathematical description of reality, although there is no single definition of what a model is. For some it is a way of structuring data, applying theories, and testing hypotheses. The Standard Model of fundamental particles is a good example of that. The key point is that for a model to be any good it has to make predictions. But those predictions must be seen alongside the limitations on the predictions it can make.

Under The Hood

Climate models are among the most complicated scientific models ever devised. They are simulations of the general circulation of a planetary atmosphere on a rotating sphere with variable radiant input coming from one direction and unevenly smeared over one hemisphere (not coincident with the rotational axis), with losses spilling out in all directions. The models encompass events and structures that span over ten orders of magnitude and incorporate fundamental equations of fluid dynamics, the ideal gas law, and heat absorption and transfer between various energy sources such as the Sun and latent heat. They simulate the effects of the composition of the atmosphere and the variation of its parameters with height, and they take account of the surface the atmosphere overlies, be it ocean, land (of many types) or ice. Radiation is absorbed, reradiated and reflected in different ways at different levels and by different things in the atmosphere that change in time over different surfaces that in turn also change over many timescales. Then there are biological effects to simulate.

All this is done using about a million lines of computer code pixelating the atmosphere into small boxes, laterally and vertically, all coupled together to build up larger and larger regions of the Earth. Let’s put those million lines of code into context. It’s about the same as an average computer game. It’s less than a modern version of Photoshop or even Windows 3, far less than Google Chrome, a Boeing 747 control system or the Mars Curiosity Rover, and only a few per cent the size of the LHC or even Microsoft Office 2013 and Mac OS X. Facebook has over fifty times more lines of code than climate models. Climate code is often written in Fortran laced together with Unix scripts, and it requires considerable programming experience to operate.

Today’s climate models are conglomerations of hundreds, if not thousands, of smaller models, each with its own limits and shortcomings, simplifications and computational approximations, held together by a glue of assumptions and educated guesses.

It is remarkable to me that such climate models predict anything that remotely resembles the way the climate is changing in the real world, but they do. That is not to say that all climate models agree; they don’t. It’s not uncommon for different climate models to produce regional temperature differences of 5 deg C. In addition, climate model outcomes are sensitive to their starting points, and a slight variation can lead to different outcomes. Similar climate models can produce a range of responses, and often scientists don’t know why or even which is the better model.
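The sensitivity to starting points mentioned above can be illustrated with a toy example. The sketch below is not a climate model. It assumes the logistic map, a standard one-line chaotic system, as a stand-in, and the function names and the size of the perturbation are hypothetical choices made purely for illustration. It shows only that two runs of the same deterministic rule that begin almost identically can end up far apart.

```python
def logistic_trajectory(x0, r=4.0, steps=50):
    """Iterate the logistic map x -> r * x * (1 - x) starting from x0.

    The logistic map is a toy chaotic system (an assumption of this sketch),
    not a piece of any real climate model.
    """
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


def max_divergence(x0, delta=1e-10, steps=50):
    """Largest gap between two runs whose starting points differ by delta."""
    a = logistic_trajectory(x0, steps=steps)
    b = logistic_trajectory(x0 + delta, steps=steps)
    return max(abs(p - q) for p, q in zip(a, b))


if __name__ == "__main__":
    # Two runs that start one part in ten billion apart.
    print("largest gap between the two runs:", max_divergence(0.2))
```

Running the two trajectories with identical starting points gives a gap of exactly zero, so any difference that appears comes entirely from the tiny initial perturbation.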

Climate scientists get together to compare models and ask themselves why they are different and how each group allows for poorly understood effects like the feedbacks associated with clouds, or the carbon cycle. One technique used to ‘validate’ models is to calculate a hindcast. It often goes like this. A climate model is developed that reproduces the conditions found at a particular time. It is then run backwards to see if it matches the climate changes that have been observed in the past. If it does, the model is said to have some ‘skill.’ But hindcasting is fraught with all sorts of problems. Making sure that past observations used to help develop the climate model are not inadvertently affecting the hindcast’s attempts to reproduce those past observations can be tricky. If one knows the past data then one is prey to all sorts of unappreciated biases and selection effects. For this reason forecasts are much more important. Nothing gives one more confidence in a model than having predictive power, as Descartes said.

The problem is that, as has been said often, future data is not currently available. If you want to test a 20-year forecast you would need perhaps 200 years of future data so that the shorter-term forecast can be put into its proper perspective.

Predictive Test

So how are climate models doing in predicting the future of the climate? Not very well. As has been pointed out many times they are running hot, and poorly predicting the Earth’s surface temperature.

In other areas of science if a model fails a prediction there is a question mark over it. If it consistently fails then it is discarded in favour of those models that have a better resemblance with reality. But curiously in climate science it is no dishonour for a model to be unable to predict such an important climatic observable as the Earth’s surface temperature. Instead of following a survival of the fittest strategy, all models, including the weak ones that do not work well, are used to calculate an average model response, with the spread of the models’ output being used to establish confidence limits. This average is then given a greater level of respectability than individual model runs, even the few that do seem to match the observations.

In this way climate models that fail are being treated identically to the few that do accord with reality. This leads one to wonder why one should even bother to check the climate models against observations if the climate model forecasts a rising temperature! Some scientists treat all the outputs of climate models as individual experiments, as though they were governed by the same statistics as repeated measurements of the same thing. They are not. Scientists are often unable to explain why different models produce different outputs from the same input data, and the models are sensitive to initial conditions. Their outputs therefore cannot be treated in the same way as independent observations, with a mean calculated along with meaningful confidence limits.

What would happen in other areas of science is that the computer models that reproduce reality would be kept and those that didn’t would be discarded (although they would be worked upon to see if they contained information as to why they were wrong). Imagine what chaos there would be in other areas of scientific prediction if we took models that had succeeded and diluted them with a much larger number of models that had failed and considered them all as a statistically coherent ‘ensemble’!
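The contrast drawn above, between averaging every model and keeping only those that match observations, can be made concrete with a small sketch. Everything in it is an assumption made for illustration: the six trend values and the observed value are invented and are not real model output or real measurements, the function names are hypothetical, and the use of a sample standard deviation as the ‘spread’ is one simple choice rather than a description of how any climate group actually builds its confidence limits.

```python
from statistics import mean, stdev

# Hypothetical warming trends (deg C per decade) from six imaginary models.
# These numbers are invented for illustration and are not real model output.
MODEL_TRENDS = [0.10, 0.15, 0.20, 0.25, 0.30, 0.35]
OBSERVED_TREND = 0.08  # also invented


def ensemble_summary(trends):
    """Return the ensemble mean and the sample standard deviation (spread)."""
    return mean(trends), stdev(trends)


def inside_spread(observed, trends, n_sigma=2.0):
    """True if the observation lies within n_sigma spreads of the ensemble mean."""
    centre, spread = ensemble_summary(trends)
    return abs(observed - centre) <= n_sigma * spread


def screen_models(trends, observed, tolerance):
    """Keep only the models whose trend is within `tolerance` of the observation."""
    return [t for t in trends if abs(t - observed) <= tolerance]


if __name__ == "__main__":
    centre, spread = ensemble_summary(MODEL_TRENDS)
    print(f"ensemble mean = {centre:.3f}, spread = {spread:.3f}")
    print("observation inside 2-sigma envelope:",
          inside_spread(OBSERVED_TREND, MODEL_TRENDS))
    print("models kept after screening:",
          screen_models(MODEL_TRENDS, OBSERVED_TREND, 0.05))
```

With these invented numbers the observation sits inside the two-sigma envelope even though only one of the six models lies close to it. The envelope is wide precisely because the poorly matching models are included.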

It was this frame of mind that meant that in some quarters the ‘pause’ in global surface temperatures seen since 1997 took so long to be appreciated. When it was, it was seen as an aberration – an unusual departure from the climate models’ prediction curve. Wait long enough and nature will return to the models’ ensemble average. Some pointed to the range of possible outcomes of climate models and noted that the observations were inside that range. The models that were poorly representing reality were being used to increase the spread of possible outcomes to a point where it covered the pesky observations. Popper would be spinning in his grave, given his belief that theories and models should be attacked by data instead of protected from them.