Financial risk models have been widely criticised for both theoretical and practical failures, especially during the recent financial crisis. In spite of this, all proposals for reforming model use have been resisted. This is not surprising given how deeply ingrained models are in the practice of finance.
Such sentiments are eloquently stated in the conclusion of a comment on a recent Vox article:
“As a risk manager I fully recognise the shortcomings of any model based on or calibrated to the past. But I also need something practical, objective, and understandable to measure risk, set and enforce limits, and encourage discussions about positions when it matters. It is very easy to criticise from the sidelines - please offer an alternative the next time.”
Jan-Peter Onstwedder, Comment on Danielsson (2011)
Our aim here is to respond to challenges such as those in Jan-Peter's comment by making specific proposals for how risk models should be used in practice, and by identifying how problems with models can be avoided. For background on the theoretical aspects of risk models, see Danielsson (2009, 2011). A practical analysis of models can be found in Macrae and Watkins (1998).
Nature of risk and of risk models
Financial risk is a forecast, not a measurement. Every risk forecast is an uncertain assessment of the underlying risk factors, often with wide confidence intervals, resulting from parameter uncertainty, model error and data snooping, and usually containing an uncomfortably large subjective element. Even nonparametric estimates require choices, such as the length of the estimation period.
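The width of such confidence intervals can be illustrated with a minimal sketch. It assumes, purely for illustration, that daily returns are independent draws from a normal distribution with 1% volatility and that risk is forecast as a 99% historical-simulation VaR from 500 observations. The function names, sample size and distribution are our assumptions, not part of any particular risk system.

```python
import random

def historical_var(returns, level=0.99):
    """Historical-simulation VaR: the loss at the given quantile of the sample."""
    losses = sorted(-r for r in returns)
    index = min(int(level * len(losses)), len(losses) - 1)
    return losses[index]

def var_estimate_spread(n_obs=500, n_trials=1000, level=0.99, seed=1):
    """Re-estimate VaR on many independent samples from the same distribution."""
    rng = random.Random(seed)
    estimates = sorted(
        historical_var([rng.gauss(0.0, 0.01) for _ in range(n_obs)], level)
        for _ in range(n_trials)
    )
    # 5th percentile, median and 95th percentile of the VaR estimates
    return (estimates[int(0.05 * n_trials)],
            estimates[n_trials // 2],
            estimates[int(0.95 * n_trials)])

low, mid, high = var_estimate_spread()
print(f"99% VaR estimates: 5th pct {low:.4f}, median {mid:.4f}, 95th pct {high:.4f}")
```

Even in this idealised setting, with no structural breaks and a correctly specified distribution, the same procedure applied to different samples gives visibly different risk numbers.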
Financial risk can only be understood in terms of a model. It may be a formal model, but whenever a user adopts some rule for controlling risk there must be a model implied by the rules adopted. For example, lending ratio restrictions imply a simple model that more bank loans lead to greater risk. A more complex model incorporating different levels of loan risk and operational risk is implicit in the Basel II risk weightings.
Despite this model dependency and uncertainty, end users tend to perceive numbers representing risk as coming from a scientific measurement (using a Riskometer, in the language of Danielsson 2009) rather than from an uncertain statistical procedure. Users need numbers they can use to convince their boss, client or regulator, so they prefer "objective" risk forecasts, whereas forecasts accompanied by qualifications and uncertainties appear less objective.
We suspect this leads users to prefer commercial risk software that provides a single number, unencumbered by confidence intervals, even though this makes it particularly hard for users to evaluate the reliability of off-the-shelf models. Where confidence intervals are estimated, their reliability is often suspect. This is succinctly illustrated by David Viniar, Goldman's chief financial officer, stating: "We were seeing things that were 25-standard deviation moves, several days in a row" (Financial Times 2007). This can only mean that Goldman grossly underestimated its standard deviations, making confidence intervals far too tight.
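To see why a 25-standard deviation move cannot be taken at face value, the sketch below computes the one-sided tail probability of a k-standard deviation move. It assumes normally distributed daily returns, which is our illustrative assumption and not a claim about the model Goldman used.

```python
import math

def normal_tail_probability(k):
    """P(Z > k) for a standard normal Z, via the complementary error function."""
    return 0.5 * math.erfc(k / math.sqrt(2.0))

for k in (3, 5, 10, 25):
    p = normal_tail_probability(k)
    print(f"{k:>2}-sigma daily move: probability {p:.3e}")
```

Under this assumption the probability of a single 25-sigma day is so small that observing several in a row is evidence against the estimated standard deviation, not evidence of extraordinary bad luck.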
Why are uncertainties in risk forecasts so high?
There are several reasons why uncertainties in risk forecasting are higher than is usually assumed:
- The model estimation period is too short;
- There are structural breaks during the estimation period;
- Data snooping and model optimisation occur;
- Portfolios are optimised, maximising errors;
- It is often necessary to forecast extreme risks.
Since the first two issues are well known, we want to focus on the final three.
Data snooping and model optimisation
Every student of econometrics is taught the danger of data snooping. If we run a single regression, we get correct confidence intervals for parameter estimates and forecasts, subject to certain basic assumptions. If, however, we arrive at the same model by searching over a number of explanatory variables and model specifications, these assumptions are violated and the confidence intervals will be underestimated. The more complex the model and the smaller the dataset, the larger the underestimation becomes.
The misleading inference that data snooping can cause is demonstrated by Sullivan et al. (1999), who show that apparently statistically significant technical trading rules are not significant if confidence intervals are calculated correctly, taking into account the search for the best model.
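The effect of searching for the best model can be sketched with a small simulation. Here both the target series and every candidate explanatory variable are pure noise, so any "significant" relationship is spurious. The setup (100 observations, independent normal draws, an approximate 5% critical value of 1.98) is our assumption and is far simpler than the bootstrap procedure of Sullivan et al. (1999).

```python
import math
import random

def t_statistic(x, y):
    """t-statistic of the slope in a one-variable regression of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    r = sxy / math.sqrt(sxx * syy)
    return r * math.sqrt((n - 2) / (1.0 - r * r))

def snooped_rejection_rate(n_candidates, n_obs=100, n_trials=400,
                           critical=1.98, seed=7):
    """Share of trials in which the best of n_candidates noise variables
    looks significant at the nominal 5% level."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        y = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]
        best = 0.0
        for _ in range(n_candidates):
            x = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]
            best = max(best, abs(t_statistic(x, y)))
        hits += best > critical
    return hits / n_trials

single = snooped_rejection_rate(1)
searched = snooped_rejection_rate(20)
print(f"one pre-specified variable: {single:.2f}, best of 20: {searched:.2f}")
```

With one pre-specified variable the rejection rate stays near the nominal 5%. Reporting only the best of 20 candidates, using the same critical value, makes a spurious finding far more likely.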
Similar effects are at work in risk forecasting. Risk models are routinely validated by back-testing, that is, by examining how well a model forecasts market outcomes that have already happened. If the model performs badly it will be changed, and the end result is certain to perform well in-sample, over the back-testing period.
Such common approaches to risk modelling tell us more about the level of model optimisation than about how the model will perform out-of-sample in the future. Most risk models in practice appear to us to emphasise fitting past events over out-of-sample risk forecasting. Risk models must be parsimonious, and tested over a variety of episodes of market turmoil, if they are to minimise the problem of data snooping and model optimisation. The model best at forecasting is unlikely to be the best at capturing historic events with great accuracy.
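A stylised back-test makes the point concrete. In the sketch below the candidate "models" are simply VaR levels equal to a multiplier times the estimated volatility, and the multiplier is tuned until the in-sample violation count matches the 1% target. The data are simulated normal returns, and all names and parameters are hypothetical choices for illustration.

```python
import random
import statistics

def violation_count(returns, var_level):
    """Number of days on which the loss exceeded the VaR forecast."""
    return sum(1 for r in returns if -r > var_level)

def tuned_backtest(n_days=500, target_rate=0.01, seed=0):
    """Tune a VaR multiplier in-sample, then check it out-of-sample."""
    rng = random.Random(seed)
    in_sample = [rng.gauss(0.0, 0.01) for _ in range(n_days)]
    out_sample = [rng.gauss(0.0, 0.01) for _ in range(n_days)]
    sigma = statistics.stdev(in_sample)
    expected = target_rate * n_days
    # Candidate models: VaR = multiplier * sigma, multipliers 1.50 to 3.50
    multipliers = [1.5 + 0.05 * k for k in range(41)]
    best = min(multipliers,
               key=lambda m: abs(violation_count(in_sample, m * sigma) - expected))
    var_level = best * sigma
    return (abs(violation_count(in_sample, var_level) - expected),
            abs(violation_count(out_sample, var_level) - expected))

results = [tuned_backtest(seed=s) for s in range(100)]
mean_in = statistics.mean(r[0] for r in results)
mean_out = statistics.mean(r[1] for r in results)
print(f"mean violation error in-sample {mean_in:.2f}, out-of-sample {mean_out:.2f}")
```

The tuned model matches the target almost exactly over the back-testing period by construction, while its out-of-sample violation count misses the target by more, even though the data-generating process has not changed at all.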
This imposes a fundamental limit on what risk systems can achieve, especially in a crisis, because parsimonious models cannot deliver great precision but non-parsimonious models are likely to fail out-of-sample.
Portfolio optimisation and error maximisation
A related problem arises from the use of risk models in portfolio optimisation and risk control. Where risk models are a direct input into trading decisions, providing hard constraints on risky positions, the underlying trading process and portfolios will in all likelihood adapt to and exploit model weaknesses.
This problem arises since traders optimise portfolios towards low reported risk (or equivalently low capital usage) and high returns, causing trading decisions to become biased towards assets with under-forecast risk. In other words, the trader maximises exposure to the part of the asset universe with biased risk forecasts, maximising the impact that this error has on the portfolio. Such error maximisation can affect individual trading positions, institutions, and even the financial system as a whole, as illustrated by the recent crisis.
Prior to the crisis, many structured credit products, such as certain CDO tranches, had undeserved AAA credit ratings. As many investors correctly perceived the risk of such AAA tranches as higher than the risk of corporate AAA bonds, their yields were typically somewhat higher than corporate AAA yields. This in turn made such tranches attractive to less sophisticated investors who evaluated risk solely on the basis of credit ratings.
It is not the size of the pricing bias nor the magnitude of the event that is the main culprit here; the CDO market is a relatively small part of total financial assets. The problem is that the presence of tightly-binding constraints based on inaccurate models of risk (and the consequent error maximisation) motivated certain financial institutions to acquire large exposures to these assets. This led to concentrated losses with damaging systemic consequences.
Error maximisation under active risk management leads to reduced volatility and fatter tails. The risk in common events is better managed, at the expense of bigger and more frequent extreme events. The more rigorously risk models are used to constrain positions, the more errors will be maximised and the more dramatic the consequences will be when the errors are eventually revealed.
All risk models contain errors and are thus vulnerable to error maximisation. The more widely a model is used and the more tightly a constraint binds, the worse the error maximisation becomes. This argues for heterogeneity in risk models. In the worst case, where a single model or approach is given regulatory force and applied as a hard constraint to many portfolios, a small problem in micro-prudential regulations may be elevated to a systemic level.
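The selection effect behind error maximisation can be sketched as follows. We assume, for illustration only, that every asset has the same true volatility of 20% and that the risk model's forecast equals the truth plus independent, unbiased noise. A trader who holds only the assets with the lowest forecast risk then ends up holding the assets whose risk is most under-forecast. The function name and parameter values are hypothetical.

```python
import random
import statistics

def forecast_bias(n_selected, n_assets=200, true_vol=0.20, noise=0.03,
                  n_trials=200, seed=3):
    """Average amount by which the risk of the selected assets is under-forecast."""
    rng = random.Random(seed)
    gaps = []
    for _ in range(n_trials):
        # Unbiased but noisy volatility forecasts for assets of identical true risk
        forecasts = [true_vol + rng.gauss(0.0, noise) for _ in range(n_assets)]
        # The trader picks the assets with the lowest reported risk
        chosen = sorted(forecasts)[:n_selected]
        gaps.append(true_vol - statistics.mean(chosen))
    return statistics.mean(gaps)

for n_selected in (200, 50, 10):
    print(f"hold {n_selected:>3} of 200 assets: "
          f"risk under-forecast by {forecast_bias(n_selected):.3f}")
```

Although the forecasts are unbiased across the whole asset universe, the bias in the selected portfolio grows as the selection becomes tighter, which is the sense in which a more tightly binding constraint worsens error maximisation.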
Risk managers are well aware of the potential for error maximisation. However, we suspect this is not well understood by senior management nor properly considered by designers of financial regulations.
This imposes a second fundamental limit to what a risk system can be expected to achieve, because risk systems used to constrain portfolios will have been compromised by the implicit optimisation of portfolios to contain assets for which risk systems underestimate risk. Risk systems that have been used to constrain positions will always prove unreliable in a crisis.
Extreme risk forecasts
Perhaps the greatest need for models is in the forecasting of extreme risk or tail risk, especially during periods of financial crisis and extreme market turmoil. This, however, is the area where risk models are least reliable because the effective sample size of comparable events is very small. At worst there might be one observation, or even zero when we wish to consider events not yet seen.
Over the past half century we have observed fewer than 10 episodes of extreme international market turmoil. Each of these events is essentially unique and apparently driven by different underlying causes. Inferring the statistical process that generates the data during such episodes from so few observations, each with different underlying causes, is difficult to the point of impossibility. While it might be possible to construct a model fitting nine crisis events in a row, there is no guarantee that it will perform well during the tenth.
Nor does it seem likely that we can get much information about price dynamics during turmoil by using the non-crisis data that makes up the bulk of available information, since there is ample evidence that market dynamics are very different in times of crisis. Market lore suggests that in a crisis traders rely more on simple rules of thumb (such as “all stocks have a beta of one”, or even “cash is king”) than in more nuanced normal times. This is supported by academic studies, such as Ang and Chen (2002), showing that correlations go to one during crises (a manifestation of nonlinear dependence), because of incentives to trade out of risky assets into safe assets when risk constraints bind, causing a feedback between ever higher risk and sharper constraints (see Danielsson et al. 2010).
This is the third fundamental limit to what a risk system can be expected to achieve. Regardless of how much data we have, there is never enough to reliably estimate the tails. This is why models for extreme risk can be expected to fail during market turmoil or crises.
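The difficulty of estimating tails can be illustrated by comparing how much an empirical quantile varies across samples at a moderate level and at an extreme level. The sketch assumes fat-tailed losses drawn from a Student-t distribution with three degrees of freedom and samples of 2,000 observations. Both are our illustrative assumptions, and real crisis data are, if anything, less well behaved.

```python
import math
import random

def student_t(rng, df):
    """Draw one Student-t variate as normal / sqrt(chi-square / df)."""
    chi_square = rng.gammavariate(df / 2.0, 2.0)
    return rng.gauss(0.0, 1.0) / math.sqrt(chi_square / df)

def relative_spreads(levels, n_obs=2000, n_trials=200, df=3, seed=11):
    """Width of the 5%-95% range of each quantile estimate, relative to its median."""
    rng = random.Random(seed)
    estimates = {level: [] for level in levels}
    for _ in range(n_trials):
        losses = sorted(student_t(rng, df) for _ in range(n_obs))
        for level in levels:
            index = min(round(level * n_obs), n_obs - 1)
            estimates[level].append(losses[index])
    spreads = {}
    for level, values in estimates.items():
        values.sort()
        low = values[int(0.05 * n_trials)]
        high = values[int(0.95 * n_trials)]
        spreads[level] = (high - low) / values[n_trials // 2]
    return spreads

spreads = relative_spreads([0.95, 0.999])
for level, spread in spreads.items():
    print(f"{level:.1%} quantile: relative spread of estimates {spread:.2f}")
```

The further into the tail we go, the fewer observations inform the estimate and the wider its sampling range becomes, even when every observation comes from the same, unchanging distribution.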
In our next column, we consider how the intrinsic shortcomings of risk models matter for their four main uses. We also make some suggestions on how the financial industry and supervisors should use models in practice.
References
Danielsson, Jon (2009), “The myth of the Riskometer”, VoxEU.org, 5 January.
Danielsson, Jon (2011), “Risk and crises”, VoxEU.org, 18 February.
Danielsson, Jon, Hyun Song Shin, and Jean-Pierre Zigrand (2010), “Risk Appetite and Endogenous Risk”, Financial Markets Group Working Papers.
Macrae, Robert and Chris Watkins (1998), “A Disaster Waiting to Happen”.
Sullivan, Ryan, Alan Timmermann and Hal White (1999), “Data-Snooping, Technical Trading Rule Performance, and the Bootstrap”, Journal of Finance.
Ang, Andrew and Joseph Chen (2002), “Asymmetric correlations of equity portfolios”, Journal of Financial Economics, 63(3):443-494.