AcadiFi

QuantFinance_Dev · 2026-04-09
CFA Level II · Quantitative Methods · Machine Learning

How do decision trees work for financial classification problems?

CFA Level II covers decision trees as a machine learning technique. I understand they split data into branches, but how exactly does the algorithm decide where to split? And what are the limitations?

121 upvotes
Verified Expert
AcadiFi Certified Professional

Decision trees are intuitive classification (or regression) models that recursively partition data into increasingly homogeneous subsets. Think of it as a series of yes/no questions that narrows down to a prediction.

How splitting works:

At each node, the algorithm tests every possible split on every feature and chooses the split that produces the most homogeneous (pure) child nodes. Purity is measured by:

  1. Entropy: H = -Σ pᵢ log₂(pᵢ) — measures disorder. Lower = purer.
  2. Gini impurity: G = 1 - Σ pᵢ² — probability of misclassification. Lower = purer.
  3. Information gain: the parent node's entropy minus the weighted average entropy of the child nodes. The algorithm picks the split with the highest gain.
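The two purity measures above are straightforward to compute. A minimal sketch (the 8-approvals/2-rejections class counts are illustrative, not from the text):

```python
import math

def entropy(probs):
    """Shannon entropy H = -sum(p * log2(p)); 0 for a pure node, 1 for a 50/50 binary split."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def gini(probs):
    """Gini impurity G = 1 - sum(p^2); chance of misclassifying a random draw from the node."""
    return 1 - sum(p * p for p in probs)

# A hypothetical node with 8 approvals and 2 rejections: p = (0.8, 0.2)
print(round(entropy((0.8, 0.2)), 3))  # ~0.722
print(gini((0.8, 0.2)))               # 0.32
```

Both measures hit zero for a pure node and peak at an even class mix, which is why minimizing either drives the tree toward homogeneous children.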

Example — Credit Approval Tree:

[Diagram: decision tree splitting on income, then debt-to-income, then credit score]

The tree first splits on income (most informative), then on debt-to-income, then on credit score — at each step choosing the variable that best separates approvals from rejections.
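The exhaustive split search described above can be sketched for a single feature. The income figures and labels below are made-up illustrations, not real data:

```python
# Toy applicant data: (income in $k, approved?) — hypothetical values for illustration.
data = [(25, 0), (32, 0), (41, 0), (48, 1), (55, 1), (63, 1), (70, 1), (38, 0)]

def gini_of(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 1 - p * p - (1 - p) ** 2

def best_split(data):
    """Test every threshold between consecutive income values and return the
    one minimizing the size-weighted Gini impurity of the two child nodes."""
    xs = sorted(x for x, _ in data)
    best = (None, float("inf"))
    for lo, hi in zip(xs, xs[1:]):
        t = (lo + hi) / 2                       # candidate split point
        left = [y for x, y in data if x <= t]
        right = [y for x, y in data if x > t]
        w = (len(left) * gini_of(left) + len(right) * gini_of(right)) / len(data)
        if w < best[1]:
            best = (t, w)
    return best

threshold, impurity = best_split(data)
print(threshold, impurity)  # 44.5 0.0 — this cut separates the classes perfectly
```

With multiple features, the same search runs over every feature and the best (feature, threshold) pair wins — which is why income, the most informative variable here, would be chosen at the root.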

Advantages:

  • Highly interpretable — you can explain the decision path
  • Handles non-linear relationships naturally
  • Works with both numerical and categorical features
  • No need to scale or normalize data

Disadvantages:

  • Prone to overfitting — deep trees memorize training data noise
  • Unstable — small data changes can produce completely different trees
  • Biased toward features with many levels — features with more unique values get more splitting opportunities
  • Typically lower accuracy than ensemble methods

Financial applications:

  • Credit approval/denial decisions
  • Fraud detection (transaction flagging)
  • Customer churn prediction (which clients will leave?)
  • Stock classification (buy/hold/sell based on fundamentals)

Controlling overfitting:

  • Set a maximum tree depth
  • Require a minimum number of samples per leaf
  • Prune branches that don't improve out-of-sample accuracy
  • Use ensemble methods (random forests) instead
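The first two controls above are stopping rules applied while the tree grows. A minimal sketch of a recursive tree-builder with both rules, on hypothetical one-feature data (in practice a library such as scikit-learn exposes these as `max_depth` and `min_samples_leaf`):

```python
def gini_of(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 1 - p * p - (1 - p) ** 2

def grow(data, depth=0, max_depth=2, min_samples_leaf=2):
    """Recursively grow a tree on (feature, label) pairs, stopping when the
    node is pure, max_depth is reached, or any child would be too small."""
    labels = [y for _, y in data]
    majority = round(sum(labels) / len(labels))
    if gini_of(labels) == 0 or depth >= max_depth:
        return majority                          # leaf: predict majority class
    best = None
    xs = sorted(x for x, _ in data)
    for lo, hi in zip(xs, xs[1:]):               # exhaustive split search
        t = (lo + hi) / 2
        left = [(x, y) for x, y in data if x <= t]
        right = [(x, y) for x, y in data if x > t]
        if len(left) < min_samples_leaf or len(right) < min_samples_leaf:
            continue                             # enforce minimum leaf size
        w = (len(left) * gini_of([y for _, y in left]) +
             len(right) * gini_of([y for _, y in right])) / len(data)
        if best is None or w < best[0]:
            best = (w, t, left, right)
    if best is None:
        return majority                          # no admissible split: stop
    _, t, left, right = best
    return (t, grow(left, depth + 1, max_depth, min_samples_leaf),
               grow(right, depth + 1, max_depth, min_samples_leaf))

def predict(node, x):
    """Walk the tree until a leaf (a bare class label) is reached."""
    while isinstance(node, tuple):
        t, left, right = node
        node = left if x <= t else right
    return node

# Hypothetical incomes ($k) with approval labels
tree = grow([(25, 0), (32, 0), (38, 0), (41, 0), (48, 1), (55, 1), (63, 1), (70, 1)])
print(predict(tree, 30), predict(tree, 60))  # 0 1
```

Tightening `max_depth` or `min_samples_leaf` trades training-set fit for generalization, which is exactly the overfitting control the exam expects you to articulate.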

Exam tip: CFA Level II tests conceptual understanding — how trees split, why they overfit, and when ensembles are preferred. You won't need to calculate entropy by hand, but you should understand the concept of information gain.

Explore machine learning for finance on AcadiFi.


Master Level II with our CFA Course

107 lessons · 200+ hours · Expert instruction

#decision-tree #classification #gini-impurity #entropy