Why is backtesting expected shortfall (ES) so much harder than backtesting VaR?
I understand that Basel moved from VaR to expected shortfall for market risk capital. But my FRM material says ES is difficult to backtest, which is a problem because backtesting is how regulators verify model accuracy. What makes ES backtesting fundamentally harder, and what are the proposed solutions?
The shift from VaR to expected shortfall (ES) in the Fundamental Review of the Trading Book (FRTB) introduced a significant practical challenge: ES is much harder to backtest than VaR. This tension is a key topic in FRM Part II.
Why VaR Is Easy to Backtest
VaR backtesting is straightforward because it only checks a binary outcome each day: did the actual loss exceed the VaR estimate? With 99% VaR over 250 trading days, you expect about 2.5 exceptions (250 × 1%). You count the exceptions and compare the total with a binomial distribution, which is the basis of the Basel traffic light system.
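The exception count can be sketched in a few lines of Python. This is a minimal illustration, not a regulatory implementation: the function names are hypothetical, losses are assumed to be positive numbers, and the zone boundaries are the standard Basel ones for 250 days of 99% VaR (green for 0-4 exceptions, yellow for 5-9, red for 10 or more).

```python
from math import comb


def count_exceptions(losses, var_forecasts):
    """Number of days on which the realized loss exceeded that day's VaR."""
    return sum(1 for loss, var in zip(losses, var_forecasts) if loss > var)


def binomial_tail(n_days, p_exceed, k):
    """P(X >= k) for X ~ Binomial(n_days, p_exceed)."""
    return sum(
        comb(n_days, i) * p_exceed**i * (1 - p_exceed) ** (n_days - i)
        for i in range(k, n_days + 1)
    )


def traffic_light_zone(n_exceptions):
    """Basel zone for 250 observations of 99% VaR."""
    if n_exceptions <= 4:
        return "green"
    if n_exceptions <= 9:
        return "yellow"
    return "red"


expected_exceptions = 250 * 0.01              # 2.5 per year
p_yellow_or_worse = binomial_tail(250, 0.01, 5)
p_red = binomial_tail(250, 0.01, 10)

print(f"expected exceptions: {expected_exceptions}")
print(f"P(5 or more exceptions | correct model) = {p_yellow_or_worse:.4f}")
print(f"P(10 or more exceptions | correct model) = {p_red:.6f}")
```

The point of the sketch is that the whole test needs nothing but a count, which is why it works even with one year of data.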
Why ES Backtesting Is Hard
Problem 1: ES is about the magnitude of tail losses, not just the count
ES = E[Loss | Loss > VaR]. You need to verify not just that losses exceed VaR the right number of times, but that the average size of those exceedances matches the ES prediction. With only 2-3 exceptions per year at the 99% level, you have a tiny sample from which to estimate this average.
Problem 2: Small sample size
With 250 daily observations and VaR at the 97.5% level, you expect only about 6 exceptions (250 × 2.5% = 6.25). A mean computed from roughly 6 observations is very noisy: the confidence interval around it is wide, so a backtest built on it has little power to tell a good ES model from a bad one.
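To get a feel for how noisy a tail mean based on about six observations is, here is a small simulation sketch. It assumes standard normal daily losses purely for illustration (real P&L is typically heavier-tailed, which would widen the spread further), and the seed and number of simulated years are arbitrary choices.

```python
import random
from statistics import NormalDist, mean, quantiles

ALPHA = 0.975      # VaR / ES confidence level
N_DAYS = 250       # one trading year
N_YEARS = 2000     # number of simulated years (arbitrary)

std_normal = NormalDist()
var_level = std_normal.inv_cdf(ALPHA)                  # true 97.5% VaR
true_es = std_normal.pdf(var_level) / (1 - ALPHA)      # true 97.5% ES

rng = random.Random(42)
yearly_tail_means = []
for _ in range(N_YEARS):
    losses = [rng.gauss(0.0, 1.0) for _ in range(N_DAYS)]
    tail = [x for x in losses if x > var_level]
    if tail:  # skip the rare year with no exceedance at all
        yearly_tail_means.append(mean(tail))

# 19 cut points at 5%, 10%, ..., 95% of the simulated yearly tail means.
cuts = quantiles(yearly_tail_means, n=20)
low, high = cuts[0], cuts[-1]

print(f"true VaR = {var_level:.3f}, true ES = {true_es:.3f}")
print(f"5%-95% range of one-year realized tail means: {low:.3f} to {high:.3f}")
```

Even with a perfectly specified model, the realized one-year tail average moves around the true ES by an amount comparable to the difference between competing models, which is what makes a direct comparison so weak.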
Problem 3: ES is not elicitable on its own
A risk measure is "elicitable" if there exists a scoring function whose expected value is uniquely minimized by the correct forecast. VaR is elicitable; ES alone is not, so there is no scoring function that evaluates an ES forecast in isolation. However, ES is jointly elicitable with VaR (the pair is elicitable), which opens up some backtesting approaches.
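Elicitability is easier to see with a concrete scoring function. The sketch below uses the quantile ("pinball") score, the standard scoring function for quantiles and hence for VaR, and checks that its average over a sample is lowest near the true 97.5% quantile. The simulated standard normal losses, the seed, and the candidate grid are illustrative assumptions.

```python
import random
from statistics import mean


def pinball_loss(forecast, outcome, alpha):
    """Quantile (pinball) score for a level-alpha forecast; lower is better."""
    if outcome > forecast:
        return alpha * (outcome - forecast)
    return (1 - alpha) * (forecast - outcome)


ALPHA = 0.975
rng = random.Random(0)
losses = [rng.gauss(0.0, 1.0) for _ in range(20000)]

# Candidate VaR forecasts on a grid from 1.00 to 3.00 in steps of 0.05.
candidates = [1.0 + 0.05 * i for i in range(41)]
avg_score = {
    v: mean(pinball_loss(v, x, ALPHA) for x in losses) for v in candidates
}
best_forecast = min(avg_score, key=avg_score.get)

# The true 97.5% quantile of a standard normal is about 1.96.
print(f"forecast with the lowest average score: {best_forecast:.2f}")
```

No comparable single-forecast score exists for ES by itself; the joint scores for the (VaR, ES) pair are what the joint-elicitability result provides.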
Problem 4: Regime changes
Tail events are by definition rare. A model calibrated to quiet markets can understate tail losses in a crisis, but a backtest over a calm window contains few or no exceptions, so you won't know the model is failing until the crisis is well underway.
Proposed Solutions
- Basel's approach: backtest VaR, calibrate ES
Under FRTB, backtesting is performed on one-day VaR at both the 97.5% and 99% levels rather than on ES itself. If VaR passes backtesting, the ES estimate (derived from the same model) is assumed to be reasonable. The ES-based capital charge is then scaled by a regulatory multiplier.
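- Acerbi-Szekely tests
Use test statistics built from realized losses beyond VaR, each divided by that day's ES forecast. Because every loss is compared with its own forecast, no separate empirical ES estimate is needed; significance is usually assessed by simulating from the model's predicted distribution.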
- Multinomial approach
Divide the tail into multiple bins (e.g., 97.5%-99%, 99%-99.5%, 99.5%+) and test whether the observed distribution across bins matches the predicted distribution.
- Ridge backtesting
Combines VaR and ES information into a joint test, exploiting their joint elicitability.
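As a rough illustration of the Acerbi-Szekely idea from the list above, the sketch below computes two statistics in the spirit of their Z1 (conditional) and Z2 (unconditional) tests. It is a simplified reading rather than a reference implementation: losses are taken as positive numbers, `tail_prob` is 1 minus the confidence level, the function name and toy data are hypothetical, and the simulation step needed for p-values is left out. Under a correct model both statistics should be close to zero; clearly negative values point to ES being understated.

```python
def acerbi_szekely_stats(losses, var_forecasts, es_forecasts, tail_prob):
    """Return (z1, z2, n_exceptions) for positive losses and positive ES.

    z1: 1 minus the average of loss / ES over exception days
        (None if there were no exceptions).
    z2: 1 minus the sum of loss / ES over exception days, divided by the
        expected number of exceptions (n_days * tail_prob).
    """
    if not (len(losses) == len(var_forecasts) == len(es_forecasts)):
        raise ValueError("all input series must have the same length")

    scaled_tail_losses = [
        loss / es
        for loss, var, es in zip(losses, var_forecasts, es_forecasts)
        if loss > var
    ]
    n_exceptions = len(scaled_tail_losses)
    total = sum(scaled_tail_losses)

    z1 = 1 - total / n_exceptions if n_exceptions else None
    z2 = 1 - total / (len(losses) * tail_prob)
    return z1, z2, n_exceptions


# Toy data: 80 days, constant forecasts VaR = 2.0 and ES = 2.5 at 97.5%.
n_days = 80
var_f = [2.0] * n_days
es_f = [2.5] * n_days

# Case A: two exceptions whose average equals the ES forecast.
losses_ok = [0.5] * 78 + [2.4, 2.6]
# Case B: two exceptions that are much larger than the ES forecast.
losses_bad = [0.5] * 78 + [3.0, 4.0]

stats_ok = acerbi_szekely_stats(losses_ok, var_f, es_f, tail_prob=0.025)
stats_bad = acerbi_szekely_stats(losses_bad, var_f, es_f, tail_prob=0.025)
print("case A (z1, z2, exceptions):", stats_ok)
print("case B (z1, z2, exceptions):", stats_bad)
```

In case A the tail losses average exactly the forecast ES, so both statistics sit at about zero; in case B the same number of exceptions is paired with larger losses, and both statistics turn negative even though a pure exception count would look identical.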
FRM exam tip: Know that the FRTB relies on VaR backtesting as a proxy for ES validation, and understand why direct ES backtesting remains an open research problem.