AcadiFi
ElasticBlend_Tobias · 2026-04-06
CFA · Level II · Quantitative Methods

How does elastic net combine L1 and L2 penalties, and when does it outperform pure LASSO or ridge?

The CFA curriculum mentions elastic net as a compromise between ridge and LASSO. I know it uses both penalties, but I'm unsure about the mixing parameter alpha. When exactly does elastic net give meaningfully better results than using either method alone?

86 upvotes
Verified Expert
AcadiFi Certified Professional

Elastic net combines the L1 (LASSO) and L2 (ridge) penalties through a mixing parameter alpha, inheriting LASSO's variable-selection ability and ridge's stability with correlated predictors. It addresses specific weaknesses that each method has on its own.

Objective Function:

Elastic net minimizes:

sum_i (y_i - X_i * beta)^2 + lambda * [alpha * sum_j |beta_j| + (1 - alpha) * sum_j beta_j^2]

- alpha = 1: pure LASSO
- alpha = 0: pure ridge
- 0 < alpha < 1: elastic net blend

Why Pure LASSO Fails with Correlated Predictors:

When two predictors are highly correlated (say, rho > 0.9), LASSO tends to select one arbitrarily and zero out the other. This is problematic when both variables carry meaningful information. Ridge keeps both but cannot eliminate truly irrelevant variables.

Worked Example:

Analyst Tobias at Riverdale Quant models credit default probability using 20 financial ratios. Several ratios cluster by category: three leverage ratios are correlated at 0.85+, and four profitability ratios at 0.90+.

| Method | Variables Selected | CV Error (bps) |
|---|---|---|
| OLS (all 20) | 20 | 145 |
| Ridge (lambda = 3.1) | 20 (all nonzero) | 98 |
| LASSO (lambda = 1.8) | 6 | 87 |
| Elastic Net (alpha = 0.5, lambda = 2.2) | 9 | 72 |

LASSO picks one leverage ratio and one profitability ratio, discarding the others arbitrarily. Elastic net retains two leverage ratios and two profitability ratios that are genuinely predictive, while still zeroing out the 11 noise variables.

Grouped Selection Property:

Elastic net's signature advantage is grouped selection: it tends to include or exclude correlated variables together rather than making arbitrary choices among them. This produces more stable, more interpretable models.

Tuning:

Elastic net requires tuning two hyperparameters (alpha and lambda), typically via a 2D grid search with cross-validation:

1. Create a grid of alpha values (e.g., 0.1, 0.3, 0.5, 0.7, 0.9)
2. For each alpha, find the optimal lambda via K-fold cross-validation
3. Select the (alpha, lambda) pair with the lowest CV error

When to Use Each Method:

- Ridge: Many predictors, all relevant, high multicollinearity
- LASSO: Many predictors, most are noise, limited correlation among signal variables
- Elastic Net: Many predictors, groups of correlated variables, need both selection and stability

Practice regularization comparisons in our CFA Quantitative Methods question bank.
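The tuning workflow above can be sketched in a few lines of Python with scikit-learn. This is an illustrative example on synthetic data (not from the worked example above): two groups of highly correlated "signal" predictors plus noise columns loosely mirror the leverage/profitability grouping. Note one naming flip: scikit-learn's `alpha` parameter is the penalty strength (lambda in this answer), and `l1_ratio` is the L1/L2 mix (alpha in this answer).

```python
# Illustrative sketch on synthetic data: LASSO vs elastic net with
# correlated predictor groups, tuned via cross-validation.
# Naming note: sklearn's `alpha` = penalty strength (lambda above),
# sklearn's `l1_ratio` = L1/L2 mixing parameter (alpha above).
import numpy as np
from sklearn.linear_model import LassoCV, ElasticNetCV

rng = np.random.default_rng(0)
n = 500

# Group 1: three near-duplicates of one latent factor (like leverage ratios).
base1 = rng.normal(size=n)
group1 = np.column_stack([base1 + 0.1 * rng.normal(size=n) for _ in range(3)])
# Group 2: four near-duplicates of a second factor (like profitability ratios).
base2 = rng.normal(size=n)
group2 = np.column_stack([base2 + 0.1 * rng.normal(size=n) for _ in range(4)])
# 11 pure-noise predictors, for 18 columns total.
noise = rng.normal(size=(n, 11))
X = np.hstack([group1, group2, noise])
y = base1 + base2 + 0.5 * rng.normal(size=n)

# LASSO: CV picks lambda only (alpha fixed at 1).
lasso = LassoCV(cv=5, random_state=0).fit(X, y)

# Elastic net 2D search: for each l1_ratio in the grid, CV finds the best
# penalty strength, then the (l1_ratio, lambda) pair with lowest CV error wins.
enet = ElasticNetCV(l1_ratio=[0.1, 0.3, 0.5, 0.7, 0.9],
                    cv=5, random_state=0).fit(X, y)

def n_selected(coefs, cols):
    """Count coefficients in a column slice that survived the penalty."""
    return int(np.sum(np.abs(coefs[cols]) > 1e-6))

for name, model in [("LASSO", lasso), ("ElasticNet", enet)]:
    print(name,
          "group1 kept:", n_selected(model.coef_, slice(0, 3)),
          "group2 kept:", n_selected(model.coef_, slice(3, 7)),
          "noise kept:", n_selected(model.coef_, slice(7, 18)))
```

On data like this, elastic net typically keeps several members of each correlated group (grouped selection), while LASSO concentrates weight on one representative per group; exact counts depend on the CV-chosen penalties.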


Master Level II with our CFA Course

107 lessons · 200+ hours · Expert instruction

#elastic-net #l1-l2 #regularization #grouped-selection #correlated-predictors