The Reliability of Credit Risk Models

Banks have recently developed new techniques for gauging the credit risk associated with portfolios of illiquid, defaultable instruments. These techniques could revolutionize banks' management of credit risk and could, in the longer term, serve as a more risk-sensitive basis for prudential regulation than the current 8% capital requirement. At a lunchtime meeting held in London, William Perraudin considered the reliability of the two main types of credit risk model developed so far. Using price data on large Eurobond portfolios, Perraudin assessed, on an out-of-sample basis, how well these models track the risks they claim to measure.

The systematic application of Value at Risk (VaR) models by large international banks has significantly enhanced their ability to measure and hedge their trading book risks. A valuable side-effect of the new emphasis on VaR modelling (a VaR estimate is the loss that will be exceeded on some given fraction of occasions if the portfolio in question is held for a particular period) is that regulators have been able to reduce the distortionary impact of prudential capital requirements for banks' trading portfolios.
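To make the parenthetical definition concrete, here is a minimal sketch of a VaR calculation under the common assumption of normally distributed portfolio returns; the numbers are purely illustrative and not drawn from the study.

```python
from statistics import NormalDist

def normal_var(mean_return, volatility, tail_prob=0.01):
    """VaR under a normal-returns assumption: the loss level that is
    exceeded with probability `tail_prob` over the holding period."""
    # The tail_prob-quantile of the return distribution; losses beyond
    # its negative occur with probability tail_prob.
    quantile = mean_return + volatility * NormalDist().inv_cdf(tail_prob)
    return -quantile  # report VaR as a positive loss figure

# Hypothetical portfolio: 5% expected return, 10% volatility, 1% tail
var_99 = normal_var(0.05, 0.10, 0.01)
```

With these inputs the 99% VaR is roughly 18% of portfolio value: the 1% quantile of returns sits about 2.33 standard deviations below the mean.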

Perraudin began by explaining the self-reinforcing nature of the developments in credit risk models. Regulators are considering changes because banks are using securitization and credit derivative transactions to arbitrage capital requirements, thus eroding the capital cushion necessary to maintain financial stability. Much of the liquidity in these new markets is itself supplied by capital-arbitrage activity, which in turn encourages their growth. Hence, the prospect that regulators may allow banks to use model output (in some form yet to be announced) in regulatory capital calculations is spurring the development of these models. By using these models, banks are better able to identify which parts of their portfolio require low economic capital and are therefore candidates for capital arbitrage under current rules.

The fundamental difficulty in assessing credit risk is that most credit exposures have no easily observed market price. This lack of information means that credit risk estimates must be based on other kinds of data. The two approaches in current use are ratings-based methods (e.g. J P Morgan's CreditMetrics) and equity-based models (e.g. KMV's Merton-style model).

Ratings-based techniques attribute a rating to each defaultable investment in a portfolio and then estimate the probability of upward or downward moves in ratings using historical data on ratings transitions for different traded bond issues. These probabilities are then combined with the average spreads for bonds from different ratings categories so as to derive mean and volatility estimates for the return on each credit exposure. By assuming approximate joint normality of the returns, a VaR for the total credit risk can be derived by using the portfolio volatility and the expected return.
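The mechanics of this paragraph can be sketched in a few lines: a hypothetical one-year transition distribution for a single bond, returns in each end-of-year rating state reflecting average spreads, and a normal VaR from the resulting mean and volatility. All figures are invented for illustration, not taken from any published transition matrix.

```python
from statistics import NormalDist

# Hypothetical one-year rating-transition probabilities for a BBB bond
transition_probs = {"A": 0.05, "BBB": 0.90, "BB": 0.04, "D": 0.01}

# Hypothetical one-year returns in each end state, reflecting average
# spreads per rating category; "D" reflects a loss net of recovery
state_returns = {"A": 0.08, "BBB": 0.06, "BB": -0.02, "D": -0.50}

mean = sum(p * state_returns[s] for s, p in transition_probs.items())
variance = sum(p * (state_returns[s] - mean) ** 2
               for s, p in transition_probs.items())
vol = variance ** 0.5

# VaR at a 1% tail probability under the approximate-normality assumption
var_99 = -(mean + vol * NormalDist().inv_cdf(0.01))
```

For a portfolio, the same quantities would be aggregated across exposures, with correlations between ratings transitions driving the portfolio volatility.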

The alternative equity-based approach starts from the observation that, under limited liability, a firm's equity value is a call option written on the firm's underlying assets. By using standard option pricing formulae, it is possible to infer from the equity and liability values of a firm the level and distribution of the firm's underlying assets. And assuming some trigger level for bankruptcy, the probability of default can be estimated. Hence the means and variances of individual bond returns, and the covariances for pairs of bonds, can be calculated by integrating numerically over the estimated distribution of changes in the underlying assets. As in the ratings-based approach, by assuming approximate normality of the portfolio value, a VaR can be derived from the portfolio mean and variance.
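The option-theoretic link can be illustrated with the standard Black-Scholes-Merton formulae. This sketch prices equity as a call on firm assets and derives a default probability from the distance to the debt trigger; the firm's figures are hypothetical, and a full KMV-style implementation would instead solve the inverse problem of backing out unobserved asset value and volatility from observed equity data.

```python
from math import exp, log, sqrt
from statistics import NormalDist

N = NormalDist().cdf  # standard normal cumulative distribution

def merton_equity_value(assets, debt, r, sigma, T=1.0):
    """Equity as a call option on firm assets with strike equal to the
    face value of debt (Black-Scholes-Merton)."""
    d1 = (log(assets / debt) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return assets * N(d1) - debt * exp(-r * T) * N(d2)

def default_probability(assets, debt, mu, sigma, T=1.0):
    """Probability that assets fall below the debt trigger at horizon T,
    assuming lognormal asset dynamics with drift mu."""
    dd = (log(assets / debt) + (mu - 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    return N(-dd)  # N(-distance to default)

# Hypothetical firm: assets 120, debt 100, 5% drift/rate, 20% asset vol
equity = merton_equity_value(120, 100, 0.05, 0.20)
pd = default_probability(120, 100, 0.05, 0.20)
```

Here the implied equity value is about 26 and the one-year default probability about 14%; in practice the asset volatility of 20% would itself be inferred from equity price volatility.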

Each of these two methods has advantages and disadvantages in coverage of 'obligors', since some borrowers are unrated but have equity market quotes, whereas other borrowers are rated but not quoted. More broadly, which method works best is likely to reflect the information content of agency ratings versus that of equity market values.

Perraudin had carried out out-of-sample back testing in order to assess the relative performance of each of these models. He had implemented the two models month by month, calculating in each period a credit risk VaR for the following year. Only lagged data, which would be available to an analyst implementing the model in the given period, were employed so that the evaluation was genuinely out-of-sample. To assess the models' performance Perraudin compared the estimated VaRs with the actual outcome of the portfolio in question one year later. If the models had supplied unbiased VaR estimates, then the fraction of occasions on which losses exceeded the VaRs would roughly equal the tail probability implied by the VaR confidence level.
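The exception-counting logic of such a back test can be sketched as follows; the loss series here is invented to show how an exception rate above the target tail probability signals underestimated risk.

```python
def backtest_exceptions(losses, var_estimates, tail_prob):
    """Compare realised losses with out-of-sample VaR estimates.
    An unbiased model's exception rate should be close to tail_prob."""
    exceptions = sum(1 for loss, var in zip(losses, var_estimates)
                     if loss > var)
    return exceptions, exceptions / len(losses)

# Hypothetical series: 100 periods, a constant VaR of 0.10, three breaches
losses = [0.02] * 97 + [0.15, 0.12, 0.11]
var_estimates = [0.10] * 100
n_exc, rate = backtest_exceptions(losses, var_estimates, tail_prob=0.01)
# rate is 0.03: three times the 1% target, evidence of understated risk
```

In practice the overlapping one-year horizons computed month by month mean that successive observations are not independent, which complicates formal significance tests on the exception count.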

The credit exposures which Perraudin had examined were large portfolios of dollar-denominated Eurobonds. This unusually rich dataset included 1,430 bond price histories observed in the period 1988–98. All the bonds were straight bonds with no call or put features. In order to implement ratings- and equity-based credit risk models on the same data, Perraudin constructed datasets of equity and liability values for the bond obligors in the sample and their ratings history.

Perraudin's main finding was that both ratings-based models and equity-based models benchmarked to default probabilities tended to underestimate the riskiness of the bond portfolios. Most of the portfolios examined experienced significantly more losses in excess of the VaR estimates than the models' stated tail probabilities implied. In the case of the ratings-based model, the problem originates from the model's underestimation of the risks associated with non-US and non-industrial obligors. However, if the equity-based model is benchmarked using bond spreads rather than default probabilities then it yields much more conservative and possibly unbiased risk estimates. Perraudin noted that the current industry practice is to benchmark against default probability, and that benchmarking against spreads is simply not feasible in the case of loan portfolios as there are no mark-to-market values available.

Perraudin stated that these findings should not be regarded as unduly negative. They show that models must be used cautiously and that particular care must be taken when models are applied to exposures falling outside the set of credit risks which have been studied and are reasonably well understood (i.e. US industrials). A major problem with credit risk models is that they are difficult to back test, since the holding periods are long and the data available are very limited. Nevertheless, Perraudin's study shows that models based on publicly available data are testable. This underlines the big advantages of using quantifiable models based as much as possible on public data. The current push by banks to develop their own internal rating systems, which are often entirely qualitative and not tied to specific ranges of expected loss or default probabilities, is not ideal, as such approaches will not be testable for a very long time, if at all.

In conclusion, Perraudin returned to his original question and the title of the meeting: are credit risk models reliable? If implemented in a simple, uncritical way then the answer is clearly no. They yield too many exceptions. But if credit risk models are used conservatively then they may be a useful tool and even a basis for capital allocation. It is important to try to design models that have testable implications so that their output can be checked.

Perraudin presented research produced under the auspices of an ESRC ‘Reaching Our Potential Award’.

‘Ratings- Versus Equity-Based Credit Risk Modeling: An Empirical Analysis’ by Pamela Nickell (Bank of England), William Perraudin (Birkbeck College, London, Bank of England and CEPR) and Simone Varotto (Bank of England).