How does principal component regression reduce dimensionality, and what are its limitations for prediction?
I'm covering PCR in CFA quant methods. The idea is to extract principal components from the predictors and regress Y on those instead. But the components maximize variance in X, not correlation with Y. Doesn't that mean the most important components for prediction might be ignored?
Principal component regression (PCR) addresses multicollinearity and high dimensionality by replacing the original correlated predictors with a smaller set of uncorrelated principal components. However, you have correctly identified its fundamental limitation: the components that capture the most variance in X are not necessarily the most relevant for predicting Y.

PCR Algorithm:

```mermaid
graph TD
  A["Original Predictors<br/>X₁, X₂, ..., X_p (correlated)"] --> B["PCA on X matrix"]
  B --> C["PC₁ explains 45% of X variance"]
  B --> D["PC₂ explains 25% of X variance"]
  B --> E["PC₃ explains 15% of X variance"]
  B --> F["PC₄...PC_p<br/>remaining 15%"]
  C --> G["Keep top m components"]
  D --> G
  E --> G
  G --> H["Regress Y on PC₁, PC₂, PC₃"]
  H --> I["PCR Model<br/>Uncorrelated regressors"]
```

Worked Example:

Analyst Noemi at Whitfield Capital predicts monthly hedge fund returns using 25 correlated risk factors. OLS with all 25 is unstable (condition number > 500).

PCA on the 25 factors extracts components:

| Component | Variance Explained | Cumulative | Correlation with Y |
|---|---|---|---|
| PC1 | 38.2% | 38.2% | 0.12 |
| PC2 | 18.7% | 56.9% | 0.41 |
| PC3 | 11.3% | 68.2% | 0.05 |
| PC4 | 8.1% | 76.3% | 0.38 |
| PC5 | 5.4% | 81.7% | 0.02 |

Using the standard rule of thumb (keep components explaining roughly 80% of cumulative variance), Noemi selects PC1 through PC5. But PC1, PC3, and PC5 have almost no correlation with the target variable. Meanwhile, PC4 is highly predictive despite explaining only 8.1% of X-variance.

PCR with 5 components: R-squared = 0.22
Using only PC2 and PC4: R-squared = 0.31 (a better fit with fewer components)

The Fundamental Limitation:

PCA is an unsupervised technique: it knows nothing about Y. The components maximizing X-variance may capture market-wide movements that explain predictor covariance but have no relation to the specific response. This is why partial least squares (PLS) was developed as a supervised alternative.

When PCR Works Well:
- When the high-variance components of X happen to also predict Y
- When the primary goal is stabilizing predictions rather than maximizing R-squared
- When dealing with near-singular X'X matrices where OLS fails entirely

When PCR Fails:
- When the predictive signal lives in low-variance components
- When interpretability of individual predictor effects is needed (components are linear combinations of all predictors)
- When a supervised method like PLS would better target the response