How does partial least squares differ from PCR by incorporating the response variable into component extraction?
I just learned about PCR's limitation that principal components don't consider Y. The curriculum says PLS fixes this by finding components that maximize covariance between X and Y. Can someone explain the PLS algorithm step by step and show why this supervised approach tends to outperform PCR?
Partial least squares (PLS) constructs latent components that simultaneously capture variance in X and maximize covariance with Y. Unlike PCR, which blindly finds the directions of greatest spread in the predictors, PLS finds directions that are both informative about X and predictive of Y.

PLS vs. PCR:

- PCR maximizes: Var(Xw), where w is the component weight vector
- PLS maximizes: Cov(Y, Xw)^2 = [Cor(Y, Xw)]^2 × Var(Xw) × Var(Y)

PLS balances high X-variance against high Y-correlation. A direction in X-space that explains only modest X-variance but strongly predicts Y will be favored over a high-variance direction uncorrelated with Y.

Simplified PLS Algorithm:

1. Compute the weight vector w_1 by regressing each column of X on Y, then normalizing it
2. Extract the first PLS component: T_1 = Xw_1
3. Regress Y on T_1 and store the coefficient
4. Deflate X by removing its projection onto T_1
5. Repeat for additional components using the deflated X
6. Choose the number of components via cross-validation

Worked Example:

Researcher Callum at Maplethorn Analytics predicts corporate bond excess returns using 18 financial and macroeconomic variables. With only 60 monthly observations, p/n = 0.30 creates instability.

Comparing approaches (5-fold cross-validation RMSE in basis points):

| Method | Components/Variables | CV RMSE |
|---|---|---|
| OLS (all 18) | 18 | 142 bps |
| PCR (5 components) | 5 | 108 bps |
| PCR (3 components) | 3 | 115 bps |
| PLS (3 components) | 3 | 89 bps |
| PLS (2 components) | 2 | 84 bps |

PLS with just 2 components outperforms PCR with 5 because those 2 PLS components are specifically constructed to predict Y.
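The six steps above can be sketched as a minimal NIPALS-style PLS1 fit (single response) in NumPy. This is an illustrative sketch on made-up data, not production code; in practice you would use a tested library such as scikit-learn's `PLSRegression`:

```python
# Minimal NIPALS-style PLS1 sketch (single response, synthetic data).
import numpy as np

def pls1_fit(X, y, n_components):
    """Fit single-response PLS via component-wise deflation."""
    X = X - X.mean(axis=0)          # center predictors
    y = y - y.mean()                # center response
    weights, scores, coefs = [], [], []
    for _ in range(n_components):
        # Step 1: weights proportional to Cov(each column of X, y)
        w = X.T @ y
        w /= np.linalg.norm(w)      # normalize the weight vector
        # Step 2: extract the component (score vector)
        t = X @ w
        # Step 3: regress y on t and store the coefficient
        q = (t @ y) / (t @ t)
        # Step 4: deflate X by removing its projection onto t
        p = (X.T @ t) / (t @ t)
        X = X - np.outer(t, p)
        y = y - q * t               # deflate y as well (PLS1 convention)
        weights.append(w); scores.append(t); coefs.append(q)
    return weights, scores, coefs

# Tiny synthetic example: 60 observations, 18 predictors
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 18))
y = X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.1, size=60)
W, T, Q = pls1_fit(X, y, n_components=2)
print(len(W), T[0].shape)  # 2 components, each score vector of length 60
```

A useful check of the deflation step: successive score vectors come out orthogonal, which is what lets each new component pick up signal the earlier ones missed.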
The first PLS component captures the combination of credit spread, term slope, and equity volatility that drives bond returns, even though these variables individually explain less total X-variance than the market-wide factor captured by PC1.

Advantages of PLS:

- Works well with many predictors relative to observations (high p/n ratio)
- Typically needs fewer components for good prediction
- Handles multicollinearity without discarding predictive signal
- Components remain interpretable as weighted combinations of the original variables

Limitations:

- Can overfit if too many components are retained (always use cross-validation)
- Less mathematically elegant than PCR (no clean eigendecomposition)
- Not as widely implemented in basic statistical software

CFA Exam Comparison:

- PCR: unsupervised dimensionality reduction, then regression
- PLS: supervised dimensionality reduction, targets prediction directly
- Both address multicollinearity; PLS usually needs fewer components