AcadiFi
WealthMgmt_CFA · 2026-04-12

How do I apply the 'correlation does not imply causation' principle when selecting CME forecasting variables?

This seems obvious in theory, but in practice I find it hard to tell a genuine causal relationship from a coincidental one. If a variable has a strong correlation with future returns AND I can construct an economic story for it, is that enough? The curriculum warns about inventing the story after finding the correlation.

95 upvotes
Verified Expert
AcadiFi Certified Professional

You've identified one of the trickiest judgment calls in CME development. A strong correlation plus a plausible story is not enough on its own: the 'no story, no future' heuristic is necessary but not sufficient, and you also need to guard against reverse-engineering the narrative.

The Spectrum of Variable Quality:

[Diagram: spectrum of variable quality (not available in this copy)]

Example — Ridgeline Capital's Variable Selection:

Ridgeline's team is building a multi-factor CME model for developed market equities. They evaluate four candidate variables:

Variable 1: Earnings Yield (E/P ratio)

  • Correlation with 10-year forward returns: 0.71
  • Economic rationale: Higher earnings yield means you're paying less per unit of earnings, mechanically implying higher expected returns
  • Verdict: STRONG — theory predates the data, causal mechanism is clear

Variable 2: Real Money Supply Growth (M2)

  • Correlation with 1-year forward returns: 0.45
  • Economic rationale: Monetary expansion increases liquidity, lowers discount rates, and boosts asset prices
  • Verdict: ACCEPTABLE — established macro theory supports this; the channel is well-documented

Variable 3: Average CEO Confidence Index

  • Correlation with 6-month forward returns: -0.52
  • Economic rationale (constructed after seeing the correlation): 'When CEOs are overconfident, they overinvest, destroying value, so high confidence predicts poor returns'
  • Verdict: SUSPECT — the story was invented to fit the finding. CEO confidence could just as easily predict positive returns through investment-led growth. The narrative is post-hoc.

Variable 4: Annual Sunspot Count

  • Correlation with equity returns: 0.38 (over certain periods)
  • Economic rationale: None
  • Verdict: REJECT — classic spurious correlation, no causal mechanism
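
To see why a figure like the sunspot correlation can show up with no mechanism behind it, here is a minimal simulation sketch. Everything in it is an assumption for illustration: the "returns" and the candidate variables are pure random noise, and the sample length (30 years) and number of candidates (200) are arbitrary choices, not Ridgeline's data. It shows that the best correlation found after screening many unrelated series is typically far larger than what a single series would show.

```python
import random
import statistics


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def best_spurious_correlation(n_candidates, n_years, seed=0):
    """Largest |correlation| between noise 'returns' and unrelated noise series."""
    rng = random.Random(seed)
    # Hypothetical annual returns: pure noise, so no candidate can truly predict them.
    returns = [rng.gauss(0.0, 1.0) for _ in range(n_years)]
    best = 0.0
    for _ in range(n_candidates):
        candidate = [rng.gauss(0.0, 1.0) for _ in range(n_years)]
        best = max(best, abs(pearson(candidate, returns)))
    return best


one = best_spurious_correlation(n_candidates=1, n_years=30)
many = best_spurious_correlation(n_candidates=200, n_years=30)
print(f"best |corr| among 1 candidate:    {one:.2f}")
print(f"best |corr| among 200 candidates: {many:.2f}")
```

The more candidates you screen and the shorter the sample, the larger the "winning" correlation tends to be, which is why a correlation found by searching needs a mechanism that was stated before the search.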

Red Flags for Post-Hoc Rationalization:

  1. The story could explain the opposite sign equally well ('high confidence could mean good or bad things')
  2. The economic mechanism requires multiple untested intermediate steps
  3. The story relies on behavioral assumptions that aren't well-established
  4. You wouldn't have predicted this variable ex ante if someone asked you to list promising CME inputs

Practical Test: Before looking at the data, write down which variables you expect to matter and what sign their coefficients should have. Then test. Variables that match your priors in both direction and approximate magnitude are far more credible than surprises.
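
A minimal sketch of that test, using the correlations from the Ridgeline example. The `Prior` class, the `check_against_priors` helper, and the variable names are hypothetical, and it assumes the team wrote down only earnings yield and M2 growth before testing. It checks sign only; comparing magnitudes would need an expected range per variable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prior:
    variable: str
    expected_sign: int  # +1 or -1, written down BEFORE looking at the data
    rationale: str


def check_against_priors(priors, observed):
    """Classify each observed correlation relative to the pre-registered priors."""
    registered = {p.variable: p for p in priors}
    verdicts = {}
    for variable, corr in observed.items():
        prior = registered.get(variable)
        if prior is None:
            verdicts[variable] = "not pre-registered: treat as possible data mining"
        elif corr * prior.expected_sign > 0:
            verdicts[variable] = "sign matches prior"
        else:
            verdicts[variable] = "sign contradicts prior: revisit the theory"
    return verdicts


# Assumed ex ante list: only the variables with theory that predates the data.
priors = [
    Prior("earnings_yield", +1, "lower price paid per unit of earnings"),
    Prior("real_m2_growth", +1, "more liquidity, lower discount rates"),
]
# Correlations quoted in the example above.
observed = {
    "earnings_yield": 0.71,
    "real_m2_growth": 0.45,
    "ceo_confidence": -0.52,
    "sunspot_count": 0.38,
}
for name, verdict in check_against_priors(priors, observed).items():
    print(f"{name}: {verdict}")
```

The point of freezing the list first is that a variable showing up only in `observed` cannot quietly acquire a rationale afterwards.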


#correlation-causation #variable-selection #data-mining-bias #economic-rationale #cme-challenges