What are the differences between grid search, random search, and Bayesian optimization for hyperparameter tuning in financial models?
I'm building a gradient boosting model for CFA-related factor investing and need to tune the number of trees, learning rate, and max depth. Grid search with all combinations takes forever. My colleague recommends random search or Bayesian optimization. Which is most appropriate for a model with 5-6 hyperparameters, and how do I avoid overfitting the hyperparameters themselves?
Hyperparameter tuning selects the best configuration for a model's structural settings — values not learned from data but set before training. The three main approaches trade off thoroughness against computational efficiency.

Comparison:

```mermaid
graph TD
    A["Hyperparameter Space"] --> B["Grid Search"]
    A --> C["Random Search"]
    A --> D["Bayesian Optimization"]
    B --> E["Tests every combination on grid<br/>Exhaustive but slow"]
    C --> F["Samples randomly<br/>Surprisingly effective with many params"]
    D --> G["Builds surrogate model<br/>Explores intelligently<br/>Fewest evaluations"]
    E --> H{"Budget: 100 evaluations"}
    F --> H
    G --> H
    H --> I["Select best config via cross-validation"]
```

Worked Example:
Pinnacle Systematic has a gradient boosting model with these hyperparameters:

| Hyperparameter | Candidate values |
|---|---|
| n_estimators | 50, 100, 200, 500 |
| learning_rate | 0.01, 0.05, 0.1, 0.2 |
| max_depth | 3, 5, 7, 10 |
| min_samples_leaf | 5, 10, 20, 50 |
| subsample | 0.6, 0.8, 1.0 |

Grid search: 4 × 4 × 4 × 4 × 3 = 768 combinations. At roughly 3.3 minutes per 5-fold CV evaluation, that is about 2,560 minutes (~43 hours).

Random search (100 iterations): Samples 100 random configurations. Research shows that just 60 random draws give a 95% probability of finding a configuration in the top 5% of the search space (1 − 0.95⁶⁰ ≈ 0.95); 100 draws push that probability above 99%. Total time: ~333 minutes (~5.5 hours).

Bayesian optimization (50 iterations): Uses a Gaussian process to model the relationship between hyperparameters and validation performance. Each iteration selects the most promising configuration via an acquisition function (e.g., expected improvement), balancing exploration of uncertain regions against exploitation of known good ones. It often reaches a near-optimal configuration within 30–50 evaluations. Total time: ~170 minutes (~2.8 hours).

Results:

| Method | Best Validation Sharpe | Evaluations | Time |
|---|---|---|---|
| Grid | 1.42 | 768 | 43 hrs |
| Random | 1.39 | 100 | 5.5 hrs |
| Bayesian | 1.44 | 50 | 2.8 hrs |

Note that Bayesian optimization can outscore the exhaustive grid because it searches continuous ranges rather than being confined to the grid's discrete values.

Avoiding Hyperparameter Overfitting:
- Use nested cross-validation: the outer loop estimates generalization, the inner loop tunes hyperparameters
- Hold out a final test set that is never touched during tuning
- Prefer simpler configurations when several produce similar scores

Explore hyperparameter tuning scenarios in our CFA Quantitative Methods course.
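The grid-size arithmetic and the random-search loop can be sketched in plain Python. This is a minimal sketch: the `cv_sharpe` function below is a hypothetical synthetic surface standing in for a real 5-fold CV Sharpe ratio, so the example runs without market data. In practice you would fit the gradient boosting model per fold (e.g., with scikit-learn's `RandomizedSearchCV`, which wraps exactly this loop).

```python
import random
from itertools import product

# Search space from the worked example above
space = {
    "n_estimators":     [50, 100, 200, 500],
    "learning_rate":    [0.01, 0.05, 0.1, 0.2],
    "max_depth":        [3, 5, 7, 10],
    "min_samples_leaf": [5, 10, 20, 50],
    "subsample":        [0.6, 0.8, 1.0],
}

# Grid search would enumerate every combination: 4 * 4 * 4 * 4 * 3 = 768
n_grid = len(list(product(*space.values())))
print(f"grid size: {n_grid}")  # 768

def cv_sharpe(cfg):
    """Hypothetical stand-in for a model's 5-fold CV Sharpe ratio:
    a smooth synthetic surface peaking at one plausible configuration."""
    return (1.5
            - 0.002 * abs(cfg["n_estimators"] - 200)
            - 4.0   * abs(cfg["learning_rate"] - 0.05)
            - 0.05  * abs(cfg["max_depth"] - 5)
            - 0.004 * abs(cfg["min_samples_leaf"] - 20)
            - 0.3   * abs(cfg["subsample"] - 0.8))

def random_search(space, n_iter, seed=42):
    """Draw n_iter configurations uniformly at random; keep the best."""
    rng = random.Random(seed)
    best_cfg, best_score = None, float("-inf")
    for _ in range(n_iter):
        cfg = {k: rng.choice(v) for k, v in space.items()}
        s = cv_sharpe(cfg)
        if s > best_score:
            best_cfg, best_score = cfg, s
    return best_cfg, best_score

best_cfg, best_score = random_search(space, n_iter=100)
print(best_cfg, round(best_score, 3))
```

A 100-iteration budget evaluates roughly 13% of the 768-point grid yet, per the probability argument above, is very likely to land near the top of the space.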
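The acquisition step in Bayesian optimization can be made concrete. Below is a minimal sketch of the expected-improvement (EI) criterion for maximization, assuming the Gaussian-process surrogate has already produced a posterior mean `mu` and standard deviation `sigma` at a candidate point (libraries such as scikit-optimize or Optuna handle the surrogate itself):

```python
from statistics import NormalDist

def expected_improvement(mu, sigma, best_so_far):
    """EI(x) = (mu - f*) * Phi(z) + sigma * phi(z), with z = (mu - f*) / sigma.
    mu, sigma: GP posterior mean/std at the candidate; best_so_far: incumbent f*."""
    if sigma <= 0.0:
        return max(mu - best_so_far, 0.0)
    z = (mu - best_so_far) / sigma
    n = NormalDist()
    return (mu - best_so_far) * n.cdf(z) + sigma * n.pdf(z)

# A candidate with a slightly lower predicted Sharpe but high uncertainty can
# have higher EI than a confident candidate sitting at the incumbent value:
print(expected_improvement(mu=1.40, sigma=0.10, best_so_far=1.42))
print(expected_improvement(mu=1.42, sigma=0.01, best_so_far=1.42))
```

This is the exploration–exploitation trade-off in one formula: the first term rewards candidates predicted to beat the incumbent, the second rewards uncertainty.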
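The nested cross-validation recipe can also be sketched end to end. This toy example tunes a single hypothetical hyperparameter (a shrinkage factor for a mean predictor) so it stays self-contained; in practice the inner loop would tune all five gradient-boosting hyperparameters, and the score would be a Sharpe ratio rather than MSE:

```python
import random
from statistics import fmean

def k_folds(n, k):
    """Split indices 0..n-1 into k (validation, training) index pairs."""
    idx, size, folds = list(range(n)), n // k, []
    for i in range(k):
        val = idx[i * size:(i + 1) * size] if i < k - 1 else idx[i * size:]
        val_set = set(val)
        folds.append((val, [j for j in idx if j not in val_set]))
    return folds

def fit(y_train, lam):
    """Toy 'model': predict the training mean shrunk toward zero by lam."""
    return lam * fmean(y_train)

def mse(pred, y):
    return fmean((pred - v) ** 2 for v in y)

def nested_cv(y, lambdas, outer_k=5, inner_k=3):
    outer_errors = []
    for test_idx, train_idx in k_folds(len(y), outer_k):
        train = [y[j] for j in train_idx]
        test = [y[j] for j in test_idx]
        # Inner loop: tune lambda using ONLY the outer-training data.
        best_lam = min(
            lambdas,
            key=lambda lam: fmean(
                mse(fit([train[j] for j in tr], lam), [train[j] for j in va])
                for va, tr in k_folds(len(train), inner_k)
            ),
        )
        # Outer loop: score the tuned model on data never seen during tuning.
        outer_errors.append(mse(fit(train, best_lam), test))
    return fmean(outer_errors)  # honest generalization estimate

rng = random.Random(0)
y = [1.0 + rng.gauss(0, 0.5) for _ in range(60)]
print(round(nested_cv(y, lambdas=[0.0, 0.5, 0.9, 1.0]), 4))
```

The key discipline is structural: the outer test fold is invisible to the inner tuning loop, so the outer score is not contaminated by the hyperparameter selection.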