How is logistic regression used for predicting loan defaults, and how do you interpret the coefficients?
I'm studying quantitative methods for FRM Part I and see that logistic regression is a core tool for credit scoring. I understand linear regression well, but I'm confused about how logistic regression constrains the output to a probability between 0 and 1, and what the coefficients actually mean in terms of odds.
Logistic regression is the workhorse model for binary credit outcomes (default vs. no-default) because it maps any combination of inputs to a probability bounded between 0 and 1.
The Model
Instead of modeling the default probability directly, logistic regression models the log-odds (logit) as a linear function:
> ln(p / (1 - p)) = b0 + b1*X1 + b2*X2 + ... + bk*Xk
Where p is the probability of default. Solving for p:
> p = 1 / (1 + exp(-(b0 + b1*X1 + ... + bk*Xk)))
This sigmoid function ensures p is always between 0 and 1, regardless of input values.
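A minimal sketch of the sigmoid in Python makes the bounding behavior concrete: no matter how extreme the logit, the output stays strictly between 0 and 1.

```python
import math

def sigmoid(z):
    """Map any real-valued logit z to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Even extreme logits stay strictly inside (0, 1):
print(sigmoid(-10))  # very close to 0
print(sigmoid(0))    # exactly 0.5
print(sigmoid(10))   # very close to 1
```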
Worked Example: Crestline Bank's SME Portfolio
Crestline Bank builds a logistic regression model to predict 1-year default for small business loans using three variables:
| Variable | Coefficient | Interpretation |
|---|---|---|
| Intercept (b0) | -3.20 | Baseline log-odds when all Xs = 0 |
| Debt-to-Income (b1) | 0.045 | Each 1-unit DTI increase raises log-odds by 0.045 |
| Years in Business (b2) | -0.18 | Each additional year reduces log-odds by 0.18 |
| Delinquency Flag (b3) | 1.35 | Prior delinquency raises log-odds by 1.35 |
For a borrower with DTI = 42, 5 years in business, and a prior delinquency:
Logit = -3.20 + 0.045(42) + (-0.18)(5) + 1.35(1) = -3.20 + 1.89 - 0.90 + 1.35 = -0.86
p = 1 / (1 + exp(-(-0.86))) = 1 / (1 + 2.363) = 0.297, or 29.7%
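The same calculation can be replicated in a few lines of Python, using the coefficients from the table above:

```python
import math

# Crestline coefficients from the table above
b0, b_dti, b_years, b_delinq = -3.20, 0.045, -0.18, 1.35

# Borrower: DTI = 42, 5 years in business, prior delinquency flag = 1
logit = b0 + b_dti * 42 + b_years * 5 + b_delinq * 1
p = 1.0 / (1.0 + math.exp(-logit))

print(round(logit, 2))  # -0.86
print(round(p, 3))      # 0.297
```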
Interpreting Coefficients as Odds Ratios
Exponentiating a coefficient gives the odds ratio. For the delinquency flag: exp(1.35) = 3.86. This means a borrower with a prior delinquency has 3.86 times the odds of default compared to one without, holding other variables constant.
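As a quick check of the odds-ratio arithmetic:

```python
import math

b_delinq = 1.35  # coefficient on the delinquency flag
odds_ratio = math.exp(b_delinq)
print(round(odds_ratio, 2))  # 3.86

# Equivalent view: the odds with flag = 1 divided by the odds with
# flag = 0, holding every other variable fixed, equals exp(b_delinq).
```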
Model Assessment Metrics
- AUC-ROC: Measures discrimination — how well the model separates defaulters from non-defaulters. Values above 0.70 are acceptable; above 0.80 is strong.
- Hosmer-Lemeshow test: Checks calibration — whether predicted probabilities match observed default rates across deciles.
- KS statistic: Maximum separation between the cumulative distributions of scores for defaulters and non-defaulters.
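To make the KS statistic concrete, here is a small sketch that computes it directly from two score samples as the maximum gap between their empirical CDFs. The score lists are made-up illustrative values, not from the Crestline example.

```python
def ks_statistic(defaulter_scores, non_defaulter_scores):
    """Return max |F_def(t) - F_nondef(t)| over all observed thresholds t."""
    thresholds = sorted(set(defaulter_scores) | set(non_defaulter_scores))
    n_d, n_n = len(defaulter_scores), len(non_defaulter_scores)
    best = 0.0
    for t in thresholds:
        f_d = sum(s <= t for s in defaulter_scores) / n_d
        f_n = sum(s <= t for s in non_defaulter_scores) / n_n
        best = max(best, abs(f_d - f_n))
    return best

# Hypothetical model scores (higher = riskier)
defaulters = [0.62, 0.71, 0.55, 0.80, 0.45]
non_defaulters = [0.10, 0.22, 0.35, 0.18, 0.50]

print(ks_statistic(defaulters, non_defaulters))  # 0.8
```

A higher KS means the score distributions of defaulters and non-defaulters overlap less, i.e. better discrimination.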
FRM exam tip: Be comfortable converting between log-odds, odds, and probability. Also know that logistic regression assumes a linear relationship in the log-odds space — not in the probability space itself. Questions may test whether adding a variable improves the model using likelihood ratio tests.
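The log-odds / odds / probability conversions from the exam tip can be sketched as three one-line functions, round-tripped on the worked example's numbers:

```python
import math

def logodds_to_prob(z):
    """Logit -> probability via the sigmoid."""
    return 1.0 / (1.0 + math.exp(-z))

def prob_to_odds(p):
    """Probability -> odds."""
    return p / (1.0 - p)

def prob_to_logodds(p):
    """Probability -> logit (inverse of the sigmoid)."""
    return math.log(p / (1.0 - p))

z = -0.86                       # logit from the worked example
p = logodds_to_prob(z)
print(round(p, 3))              # 0.297
print(round(prob_to_odds(p), 3))    # odds = exp(-0.86), about 0.423
print(round(prob_to_logodds(p), 2)) # back to -0.86
```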