VoxEU Column Financial Markets

Blame the models

In response to financial turmoil, supervisors are demanding more risk calculations. But model-driven mispricing produced the crisis, and risk models don’t perform during crisis conditions. The belief that a really complicated statistical model must be right is merely foolish sophistication.

A well-known American economist, drafted during World War II to work in the US Army meteorological service in England, got a phone call from a general in May 1944 asking for the weather forecast for Normandy in early June. The economist replied that it was impossible to forecast weather that far into the future. The general wholeheartedly agreed but nevertheless needed the number now for planning purposes.

Similar logic lies at the heart of the current crisis

Statistical modelling increasingly drives decision-making in the financial system while at the same time significant questions remain about model reliability and whether market participants trust these models. If we ask practitioners, regulators, or academics what they think of the quality of the statistical models underpinning pricing and risk analysis, their response is frequently negative. At the same time, many of these same individuals have no qualms about an ever-increasing use of models, not only for internal risk control but especially for the assessment of systemic risk and therefore the regulation of financial institutions.1 To have numbers seems to be more important than whether the numbers are reliable. This is a paradox. How can we simultaneously mistrust models and advocate their use?

What’s in a rating?

Understanding this paradox helps in understanding both how the crisis came about and the frequently inappropriate responses to the crisis. At the heart of the crisis is the quality of ratings on structured investment vehicles (SIVs). These ratings are generated by highly sophisticated statistical models.

Subprime mortgages have generated most headlines. That is of course simplistic. A single asset class worth only $400 billion should not be able to cause such turmoil. And indeed, the problem lies elsewhere, with how financial institutions packaged subprime loans into SIVs and conduits and the low quality of their ratings.

The main problem with the ratings of SIVs was the incorrect risk assessment provided by rating agencies, who underestimated the default correlation in mortgages by assuming that mortgage defaults are fairly independent events. Of course, at the height of the business cycle that may be true, but even a cursory glance at history reveals that mortgage defaults become highly correlated in downturns. Unfortunately, the data samples used to rate SIVs often were not long enough to include a recession.
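To make the correlation point concrete, here is a minimal simulation sketch. It is not the rating agencies' methodology: it assumes a textbook one-factor Gaussian model in which each loan defaults when a latent variable, part shared economic factor and part loan-specific noise, falls below a cutoff. The pool size (200 loans), the 5% default probability, the 0.3 correlation and the 15% loss threshold are all illustrative assumptions, not figures from the ratings.

```python
import random
from statistics import NormalDist


def tail_loss_probability(n_loans, default_prob, correlation,
                          loss_threshold, n_trials, seed=0):
    """Estimate P(fraction of loans defaulting > loss_threshold).

    Hypothetical one-factor model: loan i defaults when
    sqrt(rho) * common + sqrt(1 - rho) * own_i < cutoff.
    """
    rng = random.Random(seed)
    cutoff = NormalDist().inv_cdf(default_prob)  # matches the marginal default rate
    w_common = correlation ** 0.5
    w_own = (1.0 - correlation) ** 0.5
    exceed = 0
    for _ in range(n_trials):
        common = rng.gauss(0.0, 1.0)  # shared state of the economy
        defaults = 0
        for _ in range(n_loans):
            latent = w_common * common + w_own * rng.gauss(0.0, 1.0)
            if latent < cutoff:
                defaults += 1
        if defaults / n_loans > loss_threshold:
            exceed += 1
    return exceed / n_trials


# Same average default rate in both runs; only the correlation differs.
independent = tail_loss_probability(200, 0.05, 0.0, 0.15, 2000)
correlated = tail_loss_probability(200, 0.05, 0.3, 0.15, 2000)
print("P(losses > 15%), independent defaults:", independent)
print("P(losses > 15%), correlated defaults: ", correlated)
```

Under these assumptions the expected default rate is identical in both runs, yet large pool losses are essentially never seen with independent defaults and occur in a non-trivial share of trials once defaults share a common factor. A model calibrated only on boom-time data would sit close to the first case.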

Ultimately this implies that the quality of SIV ratings left something to be desired. However, the rating agencies have an 80-year history of evaluating corporate obligations, which does give us a benchmark to assess the ratings quality. Unfortunately, the quality of SIV ratings differs from the quality of ratings of regular corporations. A AAA for a SIV is not the same as a AAA for Microsoft.

And the market was not fooled. After all, why would a AAA-rated SIV earn 200 basis points above a AAA-rated corporate bond? One cannot escape the feeling that many players understood what was going on but happily went along. The pension fund manager buying such SIVs may have been incompetent, but he or she was more likely simply bypassing restrictions on buying high-risk assets.

Foolish sophistication

Underpinning this whole process is a view that sophistication implies quality: a really complicated statistical model must be right. That might be true if the statistical laws of finance were akin to the laws of physics. However, finance is not physics; it is more complex (see, e.g., Danielsson 2002).

In physics the phenomenon being measured does not generally change with measurement. In finance that is not true. Financial modelling changes the statistical laws governing the financial system in real time. The reason is that market participants react to measurements and therefore change the underlying statistical processes. The modellers are always playing catch-up with each other. This becomes especially pronounced when the financial system gets into a crisis.

This is a phenomenon we call endogenous risk, which emphasises the importance of interactions between institutions in determining market outcomes. Day-to-day, when everything is calm, we can ignore endogenous risk. In crisis, we cannot. And that is when the models fail.

This does not mean that models are without merits. On the contrary, they have a valuable use in the internal risk management processes of financial institutions, where the focus is on relatively frequent small events. The reliability of models designed for such purposes is readily assessed by a technique called backtesting, which is fundamental to the risk management process and is a key component in the Basel Accords.
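As a rough illustration of what backtesting involves, the sketch below counts how often realised losses exceeded a model's value-at-risk forecast and compares that count with the number the model's confidence level implies. The function name `violation_ratio` and the toy data are hypothetical; real backtests, including those under the Basel Accords, add formal tests and thresholds that are not reproduced here.

```python
def violation_ratio(losses, var_forecasts, confidence):
    """Compare observed VaR violations with the number the model predicts.

    losses and var_forecasts are same-length sequences of positive loss
    amounts; a violation is a day on which the loss exceeded the forecast.
    """
    if len(losses) != len(var_forecasts):
        raise ValueError("losses and var_forecasts must have the same length")
    observed = sum(1 for loss, var in zip(losses, var_forecasts) if loss > var)
    expected = (1.0 - confidence) * len(losses)
    return observed, expected, observed / expected


# Toy data (made up): 100 days, a constant 99% VaR forecast of 3.0,
# and three days on which the loss was 5.0.
toy_losses = [1.0] * 97 + [5.0] * 3
toy_var = [3.0] * 100
observed, expected, ratio = violation_ratio(toy_losses, toy_var, 0.99)
print(observed, expected, ratio)
```

A ratio near one suggests the model's stated confidence level matches experience; a ratio well above one suggests it understates risk. The check only works because small, frequent events supply enough observations to compare against.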

Most models used to assess the probability of small frequent events can also be used to forecast the probability of large infrequent events. However, such extrapolation is inappropriate. Not only are the models calibrated and tested with particular events in mind, but it is impossible to tailor model quality to large infrequent events or to assess the quality of such forecasts.

Taken to the extreme, I have seen banks required to calculate the risk of annual losses once every thousand years, the so-called 99.9% annual losses. However, the fact that we can get such numbers does not mean the numbers mean anything. The problem is that we cannot backtest at such extreme frequencies. Similar arguments apply to many other calculations such as expected shortfall or tail value-at-risk. Fundamental to the scientific process is verification, in our case backtesting. Neither the 99.9% models nor most tail value-at-risk models can be backtested, so they cannot be considered scientific.
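The arithmetic behind this can be sketched in a few lines. The example assumes independent observations, one year (250 trading days) of daily data for a 99% daily model, and a generous 50 years of annual data for a 99.9% annual model; these sample sizes are illustrative assumptions, not figures from any bank.

```python
def expected_violations(confidence, n_obs):
    """Number of exceedances a correct model predicts over n_obs periods."""
    return (1.0 - confidence) * n_obs


def prob_no_violations(true_exceed_prob, n_obs):
    """Chance of seeing zero exceedances when the true per-period
    exceedance probability is true_exceed_prob (independence assumed)."""
    return (1.0 - true_exceed_prob) ** n_obs


# 99% daily model, 250 days of data.
daily_expected = expected_violations(0.99, 250)
# If the true exceedance rate were ten times the claimed 1%:
daily_miss = prob_no_violations(0.10, 250)

# 99.9% annual model, 50 years of data (an optimistic assumption).
annual_expected = expected_violations(0.999, 50)
# If the true exceedance rate were ten times the claimed 0.1%:
annual_miss = prob_no_violations(0.01, 50)

print("daily: expected violations", daily_expected, "| P(none if 10x worse)", daily_miss)
print("annual: expected violations", annual_expected, "| P(none if 10x worse)", annual_miss)
```

With daily data, a model that understates risk tenfold almost certainly produces visible violations within a year. With 50 annual observations, a correct 99.9% model expects about 0.05 violations, and a model that is wrong by a factor of ten would still show none roughly 60% of the time, so the data cannot tell a good model from a bad one.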

Demanding numbers

We do, however, see increasing demands from supervisors for exactly the calculation of such numbers as a response to the crisis. Of course the underlying motivation is the worthwhile goal of trying to quantify financial stability and systemic risk. However, exploiting the banks’ internal models for this purpose is not the right way to do it. The internal models were not designed with this in mind, and doing this calculation is a drain on the banks’ risk management resources. It is the lazy way out. If we don’t understand how the system works, generating numbers may give us comfort. But the numbers do not imply understanding.

Indeed, the current crisis took everybody by surprise in spite of all the sophisticated models, all the stress testing, and all the numbers. I think the primary lesson from the crisis is that the financial institutions that had a good handle on liquidity risk management came out best. It was management and internal processes that mattered – not model quality. Indeed, the problem created by the conduits cannot be solved by models, but the problem could have been prevented by better management and especially better regulations.

With these facts increasingly understood, it is incomprehensible to me why supervisors are increasingly advocating the use of models in assessing the risk of individual institutions and financial stability. If model-driven mispricing enabled the crisis to happen, what makes us believe that the future models will be any better?

Therefore one of the most important lessons from the crisis has been the exposure of the unreliability of models and the importance of management. The view frequently expressed by supervisors, that Basel II is the solution to a problem like the subprime crisis, does not hold up, because Basel II is itself based on modelling. What is missing is for the supervisors and the central banks to understand the products being traded in the markets and have an idea of the magnitude, potential for systemic risk, and interactions between institutions and endogenous risk, coupled with a willingness to act when necessary. In this crisis the key problem lies with bank supervision and central banking, as well as the banks themselves.


Danielsson, Jon (2002), “The Emperor has no Clothes: Limits to Risk Modelling”, Journal of Banking and Finance, 26(7), 1273–1296.




1 For example, see Nassim Taleb (2007), “Fooled by Randomness: The Hidden Role of Chance in Life and in the Markets”, Penguin Books.


