AcadiFi
QuantStrat_Maya · 2026-04-12
CFA Level II · Quantitative Methods

How does the bias-variance tradeoff affect model selection for investment return forecasting?

I'm studying CFA quantitative methods and struggling with the bias-variance decomposition. My professor says a model can have low training error but terrible out-of-sample results. How should I think about balancing bias and variance when choosing between a simple linear model and a complex machine learning model for predicting equity returns?

134 upvotes
AcadiFi Team · Verified Expert
AcadiFi Certified Professional

The bias-variance tradeoff is the fundamental tension in statistical modeling between underfitting (high bias) and overfitting (high variance). Every predictive model's total error decomposes into three components.

Error Decomposition:

Total Error = Bias² + Variance + Irreducible Noise

- Bias measures systematic error from simplifying assumptions. A linear model predicting nonlinear equity returns will consistently miss curved relationships.
- Variance measures sensitivity to training-data fluctuations. A 50-variable neural network might fit the training data perfectly but produce wildly different predictions on new data.
- Irreducible noise is randomness inherent in markets that no model can capture.

```mermaid
graph TD
    A["Model Complexity"] --> B{"Low Complexity"}
    A --> C{"High Complexity"}
    B --> D["High Bias<br/>Underfitting<br/>Misses real patterns"]
    B --> E["Low Variance<br/>Stable predictions"]
    C --> F["Low Bias<br/>Captures patterns"]
    C --> G["High Variance<br/>Overfitting<br/>Fits noise"]
    D --> H["Sweet Spot:<br/>Optimal complexity<br/>minimizes total error"]
    G --> H
```

Practical Example:

Meridian Capital wants to forecast monthly returns for a 200-stock universe. They test three approaches:

| Model | Features | Training R² | Test R² |
|---|---|---|---|
| Simple linear (3 factors) | Market, Size, Value | 8.2% | 7.1% |
| Polynomial regression (20 terms) | Interactions + squared | 22.5% | 3.8% |
| Regularized regression (LASSO) | 40 candidates, 9 selected | 14.3% | 11.6% |

The polynomial model has the best training fit but the worst test performance: classic high variance. The simple model has stable but mediocre results: high bias. LASSO achieves the best tradeoff by automatically shrinking unimportant coefficients to zero.

Model Selection Guidelines:
1. Always evaluate on out-of-sample data, never training data
2. Prefer simpler models unless complexity provides meaningful improvement on test data
3. Use regularization (Ridge, LASSO, Elastic Net) to control variance without sacrificing too much bias
4. Financial data is inherently noisy, so variance reduction typically matters more than bias reduction

For more on machine learning in finance, explore our CFA Quantitative Methods course.


Master Level II with our CFA Course

107 lessons · 200+ hours · Expert instruction

#bias-variance #model-selection #overfitting #machine-learning #regularization