When two variables are correlated, how do I determine which one is actually predictive? The curriculum says there are four possible explanations.
I'm studying CME biases for CFA Level III and the section on correlation misinterpretation lists four possibilities when A and B are correlated. I understand them in theory, but how do I actually figure out which explanation applies in practice?
This is one of the most important conceptual traps in CME development. When you observe a significant correlation between variable A and variable B, the data alone cannot distinguish among four very different realities:
Practical Example — Harborview Research:
Harborview discovers a strong correlation (r = 0.72) between US construction spending and Australian equity returns over the past 15 years. Four possible interpretations:
1. US construction → Australian equities (A predicts B):
Perhaps US construction booms signal global economic expansion, which benefits Australia's export-driven economy. Plausible but indirect.
2. Australian equities → US construction (B predicts A):
Perhaps rising Australian equities signal a strong commodities cycle (iron ore, copper), and commodity wealth flows into US real estate. Possible but unlikely as the primary channel.
3. Third variable C drives both:
This is the most likely explanation. Chinese economic growth simultaneously drives Australian commodity exports (lifting Australian equities) AND stimulates global construction demand (including in the US). The true predictor is Chinese demand, not either observed variable.
4. Spurious correlation:
The relationship may reflect coincidental trends over 15 years (both happened to grow during the same global expansion). Out-of-sample testing would reveal whether the relationship has any durability.
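To see how coincidental trends alone can produce a high correlation, here is a minimal sketch on simulated data (not Harborview's actual series). The two series, their drift and their noise level are all hypothetical: each is a cumulative sum of a shared upward drift plus its own independent shocks, so neither has any information about the other.

```python
import random

def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def trending_series(rng, n, drift, noise_sd):
    """Cumulative sum of (drift + independent Gaussian shock)."""
    level, out = 0.0, []
    for _ in range(n):
        level += drift + rng.gauss(0.0, noise_sd)
        out.append(level)
    return out

def changes(series):
    """Period-over-period changes (first differences)."""
    return [later - earlier for earlier, later in zip(series, series[1:])]

rng = random.Random(42)   # fixed seed so the run is repeatable
n_months = 180            # 15 years of monthly observations (assumed frequency)

# Hypothetical series: same drift, completely independent shocks.
series_a = trending_series(rng, n_months, drift=0.4, noise_sd=0.5)
series_b = trending_series(rng, n_months, drift=0.4, noise_sd=0.5)

corr_levels = pearson(series_a, series_b)
corr_changes = pearson(changes(series_a), changes(series_b))

print(f"correlation of levels:  {corr_levels:.2f}")
print(f"correlation of changes: {corr_changes:.2f}")
```

The levels should look strongly correlated because both series trend upward, while the period-to-period changes should show little or no correlation. Comparing changes rather than levels is one quick sanity check before trusting a correlation between trending series.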
How to Investigate:
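- Temporal ordering (Granger causality tests): Does A lead B in time, or does B lead A? If neither leads, a third variable or spurious relationship is more likely. Keep in mind that Granger causality only establishes predictive precedence; it does not prove true causation.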
- Control for candidate third variables: If adding Chinese GDP growth to a regression of B on A makes the coefficient on A statistically insignificant, then C (China) was likely driving both.
- Economic mechanism: Map out the causal chain. Is there a plausible direct mechanism from A to B? How many intermediate steps are required?
- Out-of-sample testing: Spurious correlations tend to collapse out of sample. Genuine relationships (whether direct or through C) are more likely to persist.
- Natural experiments: Look for periods where A changed for reasons unrelated to B. Did B still respond?
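Two of the checks above, temporal ordering and controlling for a third variable, can be sketched on simulated data. Everything here is an assumption for illustration: the series `a`, `b` and `c` are hypothetical, with `c` built to drive both `a` and `b` in the same period. The lead-lag check is a simple lagged correlation, not a formal Granger causality test, and the control step uses the standard first-order partial correlation formula rather than a full regression.

```python
import random

def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((p - mx) * (q - my) for p, q in zip(x, y))
    var_x = sum((p - mx) ** 2 for p in x)
    var_y = sum((q - my) ** 2 for q in y)
    return cov / (var_x * var_y) ** 0.5

def partial_corr(r_ab, r_ac, r_bc):
    """First-order partial correlation of A and B, holding C fixed."""
    return (r_ab - r_ac * r_bc) / ((1 - r_ac ** 2) * (1 - r_bc ** 2)) ** 0.5

def lagged_corr(leader, follower, lag):
    """Correlation of leader at time t with follower at t + lag (lag >= 1)."""
    return pearson(leader[:-lag], follower[lag:])

rng = random.Random(7)    # fixed seed so the run is repeatable
n = 180                   # 15 years of monthly observations (assumed frequency)

# Hypothetical data: c drives both a and b contemporaneously.
c = [rng.gauss(0, 1) for _ in range(n)]
a = [0.8 * ci + 0.6 * rng.gauss(0, 1) for ci in c]
b = [0.8 * ci + 0.6 * rng.gauss(0, 1) for ci in c]

raw = pearson(a, b)
controlled = partial_corr(raw, pearson(a, c), pearson(b, c))
a_leads_b = lagged_corr(a, b, 1)
b_leads_a = lagged_corr(b, a, 1)

print(f"raw corr(A, B):           {raw:.2f}")
print(f"partial corr(A, B | C):   {controlled:.2f}")
print(f"corr(A_t, B_t+1):         {a_leads_b:.2f}")
print(f"corr(B_t, A_t+1):         {b_leads_a:.2f}")
```

In this setup the raw correlation should be clearly positive, the partial correlation should shrink toward zero once C is held fixed, and neither series should lead the other. That pattern is what the third-variable explanation would predict. With real data, a regression with proper significance tests would be the more rigorous route.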
The Nonlinear Trap — Don't Dismiss Low Correlations Too Quickly:
The curriculum also warns about the opposite mistake: concluding that no relationship exists because the linear correlation is low. Consider the VIX index and S&P 500 returns. The Pearson correlation might be modest (around -0.3 to -0.4), but the relationship is strongly nonlinear: the VIX barely moves when markets rise slowly, but spikes dramatically during selloffs. Pearson correlation measures only the linear component of a relationship, so a low reading can mask a powerful nonlinear one.
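A stylized illustration of the point, using made-up numbers rather than VIX data: below, y is completely determined by x (y = x squared), yet the Pearson correlation between them is essentially zero because x is symmetric around zero. The variable names and the grid of x values are assumptions chosen only for the demonstration.

```python
def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((p - mx) * (q - my) for p, q in zip(x, y))
    var_x = sum((p - mx) ** 2 for p in x)
    var_y = sum((q - my) ** 2 for q in y)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical inputs: x on a symmetric grid from -2.0 to 2.0.
x = [i / 10 for i in range(-20, 21)]
# y depends on x exactly, but only through its magnitude.
y = [v * v for v in x]

linear_corr = pearson(x, y)                        # linear association
magnitude_corr = pearson([abs(v) for v in x], y)   # association with |x|

print(f"corr(x, y):   {linear_corr:.4f}")
print(f"corr(|x|, y): {magnitude_corr:.4f}")
```

The first figure should be indistinguishable from zero and the second should be high, even though both describe the same perfectly deterministic relationship. A scatter plot, or a correlation computed on a transformed variable as here, is a simple way to catch this before discarding a candidate predictor.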
Key Exam Takeaway: Never use a correlation in a predictive model without investigating the underlying causal structure. The observed statistic is the beginning of the analysis, not the end.