What is model calibration in risk management and how do you avoid overfitting?
I keep seeing 'model calibration' in the FRM curriculum but I'm not clear on how it differs from just fitting a model to data. Also, the material warns about overfitting — how do you know if your risk model is capturing real patterns vs. just memorizing historical noise?
Model calibration is one of the most practically important topics in quantitative risk management. It's the process of selecting model parameters so that the model's outputs match observed market data as closely as possible — while remaining robust out of sample.
Calibration vs. Estimation:
| Concept | What It Does | Example |
|---|---|---|
| Estimation | Fits parameters to historical data | Estimate GARCH(1,1) from 5 years of returns |
| Calibration | Adjusts parameters to match current market prices | Calibrate a volatility surface to today's option prices |
Estimation looks backward; calibration looks at the present. For risk management, both are used — estimation for historical models, calibration for pricing models.
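To make the backward-looking side concrete, here is a minimal sketch of estimation using the EWMA variance recursion (the same one-parameter model mentioned below under parsimony). The λ = 0.94 value is the common RiskMetrics convention, and the returns are made up for illustration.

```python
# EWMA variance estimation: a backward-looking fit to historical returns.
# Recursion: sigma2_t = lam * sigma2_{t-1} + (1 - lam) * r_{t-1}^2
def ewma_variance(returns, lam=0.94):
    """Return the EWMA variance after processing the full return series."""
    sigma2 = returns[0] ** 2              # seed with the first squared return
    for r in returns[1:]:
        sigma2 = lam * sigma2 + (1 - lam) * r ** 2
    return sigma2

returns = [0.010, -0.020, 0.005]          # hypothetical daily returns
print(ewma_variance(returns))             # one smoothing parameter, little to overfit
```

Calibration, by contrast, would solve for parameters that reproduce today's market prices rather than summarize yesterday's returns.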
Calibration Example — Whitfield Derivatives:
Whitfield calibrates a local volatility model to match 30 listed option prices on an equity index. The objective function minimizes:
sum_{i=1}^{30} (Model_Price_i - Market_Price_i)^2
They have 8 model parameters. After optimization, the model matches all 30 prices within $0.02. Is this good? Not necessarily.
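The shape of that objective can be sketched with a toy model. Here the "model" is deliberately simple (two parameters, linear in strike) and the "market prices" are synthetic, so the numbers are illustrative only; a real local volatility calibration would run a nonlinear optimizer over all 8 parameters.

```python
import numpy as np

# Toy calibration: choose parameters minimizing the sum of squared pricing
# errors, analogous to Whitfield's objective but with a 2-parameter model.
strikes = np.linspace(80, 120, 30)              # 30 listed options (synthetic)
market_prices = 25.0 - 0.2 * strikes \
    + np.random.default_rng(0).normal(0, 0.05, 30)

# Model price = a + b * K (linear in parameters, so least squares is exact)
X = np.column_stack([np.ones_like(strikes), strikes])
params, *_ = np.linalg.lstsq(X, market_prices, rcond=None)

residuals = market_prices - X @ params
sse = float(residuals @ residuals)              # the calibration objective
print(params, sse)
```

With 2 parameters fitting 30 prices, the residual error is forced to summarize genuine structure; with 30 parameters the fit would be exact and meaningless, which is the point of the next section.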
The Overfitting Problem:
If Whitfield had used 30 parameters for 30 prices, they'd get a perfect fit — but the model would have zero predictive power. Each parameter would memorize one data point rather than capturing the underlying volatility dynamics.
Signs of Overfitting:
- In-sample fit is excellent but out-of-sample performance collapses — the model predicts last month perfectly but fails on new data
- Parameters are unstable — recalibrating daily produces wildly different parameters
- Parameters lack economic meaning — values that make no financial sense (e.g., negative volatility of volatility)
- Excessive complexity — more parameters than justified by the data
How to Avoid Overfitting:
- Parsimony: Use the simplest model that captures the essential features. The EWMA model (1 parameter) often outperforms complex models in VaR forecasting.
- Out-of-sample testing: Calibrate on 80% of data, test on 20%. If performance degrades sharply, you're overfitting.
- Regularization: Add a penalty term for parameter magnitude: minimize [Sum of squared errors + lambda x Sum of parameters^2].
- Cross-validation: Rotate which data is in-sample vs. out-of-sample.
- Economic constraints: Restrict parameters to economically sensible ranges.
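The out-of-sample test above can be demonstrated with a toy experiment: fit a parsimonious model and an over-parameterized one to the same noisy data, then score both on a held-out window. The data is synthetic (a linear trend plus noise), so the numbers are illustrative, but the pattern (complex model wins in sample, loses badly out of sample) is the overfitting signature described above.

```python
import numpy as np

rng = np.random.default_rng(42)
x = np.linspace(0.0, 1.0, 24)
y = 2.0 * x + rng.normal(0, 0.1, x.size)         # true process: linear + noise

train, test = slice(0, 19), slice(19, 24)        # roughly an 80/20 split
def mse(deg):
    coefs = np.polyfit(x[train], y[train], deg)  # fit only on the training window
    in_s  = np.mean((np.polyval(coefs, x[train]) - y[train]) ** 2)
    out_s = np.mean((np.polyval(coefs, x[test])  - y[test]) ** 2)
    return in_s, out_s

simple_in,  simple_out  = mse(1)                 # parsimonious: 2 coefficients
complex_in, complex_out = mse(10)                # 11 coefficients memorize noise
print(simple_in, simple_out, complex_in, complex_out)
```

The degree-10 fit always achieves a lower in-sample error (it nests the degree-1 model), but its out-of-sample error degrades sharply: each extra coefficient chased noise rather than signal.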
Backtesting as the Final Check:
After calibration, backtest the model. For VaR at 99%, you expect ~2.5 exceptions per year (250 trading days x 1%). If you see 0 exceptions, the model is too conservative. If you see 10+, it's too aggressive. The Basel traffic light system formalizes this:
- Green: 0-4 exceptions
- Yellow: 5-9 exceptions
- Red: 10+ exceptions
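The exception count and traffic-light zones map directly to code. This is a minimal sketch: the `losses` and VaR limits are hypothetical daily figures, and the zone boundaries are the Basel thresholds listed above.

```python
def count_exceptions(losses, var_limits):
    """Count days on which the realized loss exceeded the reported VaR."""
    return sum(1 for loss, var in zip(losses, var_limits) if loss > var)

def traffic_light(exceptions):
    """Basel traffic-light zone for a 250-day, 99% VaR backtest."""
    if exceptions <= 4:
        return "green"
    if exceptions <= 9:
        return "yellow"
    return "red"

# Hypothetical backtest: 250 daily losses against a flat $1m VaR limit;
# two days (1.2 and 1.5) breach the limit.
losses = [0.4] * 247 + [1.2, 1.5, 0.9]
print(traffic_light(count_exceptions(losses, [1.0] * 250)))  # prints "green"
```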
For exam preparation on model risk topics, check our FRM question bank.