Dinh & Kleimeier (2007). A Credit Scoring Model for Vietnam’s Retail Banking Market. International Review of Financial Analysis 16: 471-95.
(1)LOGIT AND PROBIT MODEL
(2)OLS AND RELATIONSHIP BETWEEN VARIABLES
$y = \beta_0 + \beta_1 x + u$
When $x$ increases by 1 unit, $y$ increases by $\beta_1$ units: $\Delta y / \Delta x = \beta_1$
(3)BINARY DEPENDENT VARIABLE
Sometimes the dependent variable under consideration is binary:
Whether a loan application is approved
Whether a borrower can repay a loan
Whether a person has a credit card
(4)OLS WITH BINARY DEP VAR: THE LINEAR PROBABILITY MODEL
If $y$ is a binary variable (0/1) and we apply OLS, the model is called the Linear Probability Model (LPM)
(5)Example: probability of default
Problem: how individual and loan characteristics affect the probability of loan default
Data: 1,810 customers (borrowers) of a bank in Vietnam
Dep var: default (= 1 if, at least once during the loan duration, the borrower is unable to repay an installment within 90 days after the due date; 0 otherwise)
Regressors:
Income (income), in million VND
Ratio of collateral to loan amount (coltoloan)
(6)THE DATA
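tab default

    default |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |      1,415       78.18       78.18
          1 |        395       21.82      100.00
------------+-----------------------------------
      Total |      1,810      100.00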
(7)BIVARIATE ANALYSIS
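by default: sum income

-> default = 0
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
      income |      1415    24.72685    20.98825        1.5        190

-> default = 1
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
      income |       395    21.82709    15.16845        1.5        100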
(8)BIVARIATE ANALYSIS
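-> default = 0
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
   coltoloan |      1415    1.830526    2.627449                48.15789

-> default = 1
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
   coltoloan |       395     1.54292    1.196532                     9.7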
(9)LINEAR PROBABILITY MODEL
(10)LINEAR PROBABILITY MODEL
[Figure: plot against income; vertical axis from 0 to 1, horizontal axis income from 0 to 200]
(11)DISADVANTAGES OF LPM
Assumes that $\Pr(y = 1)$ is linearly related to $X$, regardless of the initial value of $X$
Fitted values of $y$ may fall outside [0,1]
Violates the assumption that $u$ is normally distributed
$u$ has unequal variance, which makes hypothesis tests unreliable
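The last two points follow from the dependent variable taking only the values 0 and 1. As a short sketch, writing $P_i = \Pr(y_i = 1 \mid X_i)$ (notation assumed here for illustration), the error can take only two values for a given $X_i$, and its variance depends on $X_i$:

```latex
\begin{aligned}
u_i &= y_i - P_i \in \{\, 1 - P_i,\; -P_i \,\} \\
\operatorname{Var}(u_i \mid X_i) &= P_i (1 - P_i)
\end{aligned}
```

Because $P_i$ changes with $X_i$, the error variance cannot be the same for every observation.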
(12)THE LOGIT MODEL
Assume the probability of loan default depends only on an unobservable index $I_i^*$, and that this index is a function of the regressors:
$I_i^* = X_i\beta + u_i$
Assume that:
$Y_i = 1$ (default) if $I_i^* \ge 0$
$Y_i = 0$ (otherwise) if $I_i^* < 0$
The probability of default is then:
$P(Y_i = 1) = P(I_i^* \ge 0) = P(X_i\beta + u_i \ge 0) = P(u_i \ge -X_i\beta)$
If the distribution of $u_i$ is symmetric about zero, then:
$P_i = P(Y_i = 1) = P(u_i \le X_i\beta)$
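The latent-index argument can be checked with a small simulation. This is only a sketch under stated assumptions: the error is drawn from a standard logistic distribution (one symmetric choice among several), and the index value `x_beta = -1.3` is hypothetical, not an estimate from the data.

```python
import math
import random

def simulate_default_rate(x_beta, n_draws=100_000, seed=0):
    """Share of draws with latent index I* = x_beta + u >= 0 (u standard logistic)."""
    rng = random.Random(seed)
    defaults = 0
    for _ in range(n_draws):
        p = rng.uniform(1e-12, 1 - 1e-12)  # keep p away from 0 and 1 so the log is defined
        u = math.log(p / (1 - p))          # inverse-CDF draw from the logistic distribution
        if x_beta + u >= 0:                # Y = 1 (default) when the index is non-negative
            defaults += 1
    return defaults / n_draws

x_beta = -1.3                                # hypothetical value of the index's systematic part
simulated = simulate_default_rate(x_beta)
theoretical = 1 / (1 + math.exp(-x_beta))    # P(u <= x_beta) for a standard logistic u
print(round(simulated, 3), round(theoretical, 3))
```

The simulated default share should sit close to the closed-form probability, which is what the symmetry step asserts.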
(13)THE LOGIT MODEL
The Logit model assumes $u_i$ follows the logistic distribution
Probability of default:
$P_i = P(u_i \le X_i\beta) = \frac{1}{1 + e^{-Z_i}}$, with $Z_i = X_i\beta$
Probability of non-default:
$1 - P_i = \frac{1}{1 + e^{Z_i}}$
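The non-default probability follows from the default probability in one line of algebra. A worked step, using $Z_i$ for the index as on this slide:

```latex
\begin{aligned}
1 - P_i &= 1 - \frac{1}{1 + e^{-Z_i}}
         = \frac{e^{-Z_i}}{1 + e^{-Z_i}}
         = \frac{1}{1 + e^{Z_i}}
\end{aligned}
```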
(14)THE LOGIT MODEL
The odds in favor of default are the ratio of the probability of default to the probability of non-default:
$\frac{P_i}{1 - P_i} = \frac{1 + e^{Z_i}}{1 + e^{-Z_i}} = e^{Z_i}$
Taking logs of both sides, we obtain the logit:
$L_i = \ln\left(\frac{P_i}{1 - P_i}\right) = Z_i = X_i\beta$
The LPM assumes $P_i$ is linearly related to $X_i$; the Logit model assumes the logit $L_i$ is linearly related to $X_i$
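The chain from index to probability to odds to logit can be traced numerically. A minimal sketch; the function names and the index value `z = -1.3` are illustrative choices, not part of the slides.

```python
import math

def default_probability(z):
    """Logistic probability P = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def odds(p):
    """Odds in favor of default: P / (1 - P)."""
    return p / (1.0 - p)

def logit(p):
    """Log of the odds."""
    return math.log(odds(p))

z = -1.3                      # hypothetical index value
p = default_probability(z)
print(p, odds(p), logit(p))   # the odds equal exp(z); the logit returns z
```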
(15)PROPERTIES OF LOGIT MODEL
$P_i$ varies from 0 to 1, while the logit $L_i$ varies from $-\infty$ to $+\infty$
Although $L_i$ is a linear function of $X_i$, the probability $P_i$ is not
Interpretation of estimated coefficients: $\beta_j$ is the change in the log-odds when $x_j$ increases by 1 unit
Once $\beta$ is estimated, we can predict the odds and the probability of default $P_i$
In the LPM, the marginal effect of $x_j$ is constant. In the Logit model, it is not.
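Since a coefficient is a change in the log-odds, exponentiating it gives the factor by which the odds are multiplied. A small sketch using the income coefficient from the logit output reported in these slides; the variable names are illustrative.

```python
import math

beta_income = -0.0071093  # logit coefficient on income, as reported in the Stata output

odds_multiplier = math.exp(beta_income)           # odds of default are multiplied by this per extra unit of income
percent_change = (odds_multiplier - 1.0) * 100.0  # the same effect expressed as a percentage change in the odds
print(round(odds_multiplier, 5), round(percent_change, 3))
```

A multiplier slightly below 1 means each extra unit of income lowers the odds of default by a bit under one percent, whatever the starting level of income.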
(16)ESTIMATION METHOD
Maximum Likelihood (ML)
ML seeks the $\beta_j$ such that $\log L$ is maximized:
$\log L = \sum_{i=1}^{n} \left[ Y_i \ln P_i + (1 - Y_i) \ln(1 - P_i) \right]$
where $P_i = \frac{1}{1 + e^{-Z_i}}$ and $Z_i = X_i\beta$
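To make the maximization concrete, the sketch below fits the simplest possible logit, one with an intercept only, to the default counts from the `tab default` table (395 defaults, 1,415 non-defaults). Newton-Raphson is used here as one common way to maximize the log likelihood; it is an assumption for illustration, not a description of what Stata does internally.

```python
import math

N_DEFAULT, N_REPAID = 395, 1415  # counts from the `tab default` output

def log_likelihood(b0):
    """log L for an intercept-only logit: every borrower has P = 1 / (1 + exp(-b0))."""
    p = 1.0 / (1.0 + math.exp(-b0))
    return N_DEFAULT * math.log(p) + N_REPAID * math.log(1.0 - p)

def maximize(b0=0.0, steps=25):
    """Newton-Raphson iterations on the single parameter b0."""
    n = N_DEFAULT + N_REPAID
    for _ in range(steps):
        p = 1.0 / (1.0 + math.exp(-b0))
        gradient = N_DEFAULT - n * p    # first derivative of log L with respect to b0
        hessian = -n * p * (1.0 - p)    # second derivative (always negative)
        b0 -= gradient / hessian
    return b0

b0_hat = maximize()
print(b0_hat, log_likelihood(b0_hat))
```

With no regressors, the maximizing intercept is the log of the sample odds, ln(395/1415), so the fitted probability equals the observed default share.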
(17)ESTIMATE LOGIT MODEL IN STATA
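Iteration 3:   log likelihood = -944.24413

Logistic regression                               Number of obs   =       1810
                                                  LR chi2(2)      =      10.79
                                                  Prob > chi2     =     0.0045
Log likelihood = -944.24413                       Pseudo R2       =     0.0057

------------------------------------------------------------------------------
     default |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      income |  -.0071093    .003187    -2.23   0.026    -.0133557   -.0008628
   coltoloan |  -.0547242   .0307794    -1.78   0.075    -.1150508    .0056024
       _cons |  -1.019157    .097014   -10.51   0.000    -1.209301   -.8290129
------------------------------------------------------------------------------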
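The LR chi2 and Pseudo R2 lines of this output can be approximately reproduced by hand. The sketch below assumes the comparison model is the intercept-only logit, whose log likelihood depends only on the counts in the `tab default` table, and that the pseudo R2 is McFadden's measure; variable names are illustrative.

```python
import math

n_default, n_repaid = 395, 1415  # from the `tab default` output
ll_model = -944.24413            # log likelihood reported for the fitted logit

n = n_default + n_repaid
p_bar = n_default / n            # sample default rate
# Log likelihood of the intercept-only model, where every borrower gets P = p_bar
ll_null = n_default * math.log(p_bar) + n_repaid * math.log(1.0 - p_bar)

lr_chi2 = 2.0 * (ll_model - ll_null)   # likelihood-ratio statistic, 2 degrees of freedom here
pseudo_r2 = 1.0 - ll_model / ll_null   # McFadden's pseudo R2
print(round(lr_chi2, 2), round(pseudo_r2, 4))
```

Both numbers should come out close to the reported 10.79 and 0.0057; small differences in the last digit can arise because the reported log likelihood is rounded.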
(18)INTERPRETATION OF COEFFICIENT
$L_i = \ln\left(\frac{P_i}{1 - P_i}\right) = X_i\beta$
$P_i = \frac{1}{1 + e^{-Z_i}}$, with $Z_i = X_i\beta$
Coefficient $\beta_j$ only indicates the direction of the effect of $X_j$ on $P_i$. It says nothing about the magnitude of the effect.
(19)MARGINAL EFFECTS
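If we want to know: when $X$ increases by 1 unit, how much does $P$ change (the marginal effect)?
$P_i = \frac{1}{1 + e^{-X_i\beta}}$
$\frac{\partial P_i}{\partial X_i} = \beta \frac{e^{-X_i\beta}}{(1 + e^{-X_i\beta})^2}$
The marginal effect in the logit model is not constant. It varies with $X$.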
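The marginal effect can be written in a form that is easier to compute. A short derivation, assuming $x_{ij}$ denotes the $j$-th regressor for borrower $i$ (notation introduced here for illustration):

```latex
\begin{aligned}
P_i &= \frac{1}{1 + e^{-X_i\beta}} \\
\frac{\partial P_i}{\partial x_{ij}}
    &= \beta_j \, \frac{e^{-X_i\beta}}{\left(1 + e^{-X_i\beta}\right)^{2}}
     = \beta_j \, P_i \, (1 - P_i)
\end{aligned}
```

The factor $P_i(1 - P_i)$ is largest at $P_i = 0.5$ and shrinks toward zero as $P_i$ approaches 0 or 1, which is why the effect depends on where it is evaluated.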
(20)MARGINAL EFFECTS IN STATA
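Marginal effects at the mean of X:

mfx

Marginal effects after logit
      y  = Pr(default) (predict)
         =  .21632929
------------------------------------------------------------------------------
variable |      dy/dx    Std. Err.     z    P>|z|  [    95% C.I.   ]      X
---------+--------------------------------------------------------------------
  income |  -.0012052      .00054   -2.24   0.025  -.002261 -.000149    24.094
coltol~n |  -.0092774       .0052   -1.78   0.075  -.019476  .000921   1.76776
------------------------------------------------------------------------------

Marginal effects at income of 40 mil VND and no collateral:

mfx, at(income = 40 coltoloan = 0)

Marginal effects after logit
      y  = Pr(default) (predict)
         =   .2135719
------------------------------------------------------------------------------
variable |      dy/dx    Std. Err.     z    P>|z|  [    95% C.I.   ]      X
---------+--------------------------------------------------------------------
  income |  -.0011941      .00049   -2.42   0.015  -.002161 -.000228        40
coltol~n |  -.0091914       .0055   -1.67   0.095  -.019971  .001588         0
------------------------------------------------------------------------------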
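The dy/dx figures can be recomputed from the logit coefficients alone, using the standard logit result that the marginal effect equals the coefficient times P(1 - P). The sketch below is an illustration with assumed helper names; the coefficients and evaluation points are the ones reported in the Stata output in these slides.

```python
import math

# Logit coefficients as reported in the Stata output in these slides
COEF = {"_cons": -1.019157, "income": -0.0071093, "coltoloan": -0.0547242}

def default_probability(income, coltoloan):
    """Predicted probability of default at the given regressor values."""
    z = COEF["_cons"] + COEF["income"] * income + COEF["coltoloan"] * coltoloan
    return 1.0 / (1.0 + math.exp(-z))

def marginal_effects(income, coltoloan):
    """dP/dx for each regressor: coefficient times P * (1 - P)."""
    p = default_probability(income, coltoloan)
    scale = p * (1.0 - p)
    return {name: COEF[name] * scale for name in ("income", "coltoloan")}

at_mean = marginal_effects(24.094, 1.76776)  # sample means of income and coltoloan
at_40 = marginal_effects(40, 0)              # income of 40, no collateral
print(default_probability(24.094, 1.76776), at_mean)
print(default_probability(40, 0), at_40)
```

The results should land close to the values Stata reports: predicted probabilities near .2163 and .2136, and an income effect near -.0012 in both cases.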
(21)HYPOTHESIS TESTING
To test whether income affects default:

test income

 ( 1)  [default]income = 0

           chi2(  1) =    4.98
         Prob > chi2 =    0.0257

To test whether income and coltoloan jointly affect default, test both coefficients at once (null hypothesis: both are zero)
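The single-coefficient test is a Wald test, so its chi-square statistic is the squared z-ratio of the coefficient. A minimal sketch using the income coefficient and standard error from the logit output in these slides; the p-value uses the identity between a chi-square with 1 degree of freedom and a squared standard normal.

```python
import math

coef_income = -0.0071093  # logit coefficient on income (from the Stata output)
se_income = 0.003187      # its standard error (from the Stata output)

z = coef_income / se_income
wald_chi2 = z ** 2
# With 1 degree of freedom, P(chi2 > w) = P(|N(0,1)| > sqrt(w)) = erfc(sqrt(w / 2))
p_value = math.erfc(math.sqrt(wald_chi2 / 2.0))
print(round(wald_chi2, 2), round(p_value, 4))
```

The statistic and p-value should agree with the `test income` output up to rounding of the displayed coefficient and standard error.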
(22)THE PROBIT MODEL
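In the Logit model, $u$ follows the logistic distribution:
$P_i = P(u_i \le X_i\beta) = \frac{1}{1 + e^{-Z_i}}$, with $Z_i = X_i\beta$
In the Probit model, $u$ follows the normal distribution:
$P_i = P(u_i \le X_i\beta) = F(X_i\beta)$
where $F$ is the cumulative distribution function (CDF) of the standard normal distribution:
$F(X_i\beta) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{X_i\beta} e^{-z^2/2} \, dz$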
(23)PROBIT MODEL IN STATA
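Iteration 3:   log likelihood = -943.93571

Probit regression                                 Number of obs   =       1810
                                                  LR chi2(2)      =      11.40
                                                  Prob > chi2     =     0.0033
Log likelihood = -943.93571                       Pseudo R2       =     0.0060

------------------------------------------------------------------------------
     default |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      income |  -.0042379   .0018321    -2.31   0.021    -.0078288   -.0006471
   coltoloan |  -.0339395   .0179847    -1.89   0.059    -.0691889    .0013098
       _cons |  -.6222283   .0573382   -10.85   0.000    -.7346092   -.5098474
------------------------------------------------------------------------------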
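Probit predictions work the same way as logit predictions, with the standard normal CDF in place of the logistic function. The sketch below uses the probit coefficients reported above and evaluates them at the sample means shown in the `mfx` table (income 24.094, coltoloan 1.76776). The helper names are illustrative, and the marginal-effect formula, coefficient times the normal density at the index, is the standard probit result rather than something stated on the slide.

```python
import math

# Probit coefficients as reported in the Stata output in these slides
COEF = {"_cons": -0.6222283, "income": -0.0042379, "coltoloan": -0.0339395}

def normal_cdf(z):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def index(income, coltoloan):
    return COEF["_cons"] + COEF["income"] * income + COEF["coltoloan"] * coltoloan

def probit_probability(income, coltoloan):
    return normal_cdf(index(income, coltoloan))

def probit_marginal_effect(name, income, coltoloan):
    """dP/dx = coefficient times the normal density at the index."""
    return COEF[name] * normal_pdf(index(income, coltoloan))

p_mean = probit_probability(24.094, 1.76776)
me_income = probit_marginal_effect("income", 24.094, 1.76776)
print(p_mean, me_income)
```

Although the probit coefficients are smaller in absolute value than the logit ones, the predicted probability and the income effect at the mean come out very close to the logit figures.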
(24)LOGIT OR PROBIT
$P_i$ approaches 0 and 1 more slowly in the Logit model than in the Probit model
There is no compelling reason to choose one model over the other
However Logit is preferred for its simplicity in
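The difference in how quickly the two curves approach 0 and 1 can be seen by evaluating both CDFs at the same index values. This sketch compares the standard logistic and standard normal CDFs without rescaling either one, which is an assumption made here for simplicity; the grid of index values is arbitrary.

```python
import math

def logistic_cdf(z):
    return 1.0 / (1.0 + math.exp(-z))

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Far from zero, the logistic curve stays further from 0 and 1 than the normal curve
for z in (-4, -3, -2, 2, 3, 4):
    print(z, round(logistic_cdf(z), 5), round(normal_cdf(z), 5))
```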
(25)APPLICATION OF LOGIT/PROBIT
Dinh & Kleimeier (2007). A Credit Scoring Model for Vietnam’s Retail Banking Market. International Review of Financial Analysis 16: 471-95.
Analyzes the probability of default, similar to the
(26)APPLICATION OF LOGIT/PROBIT
Dymski & Mohanty (1999). Credit and Banking Structure: Asian and African-American Experience in LA. American Economic Review 89(2): 362-6.
Analyzes discrimination in the approval of home-purchase loan applications
Dep var: whether the application (for a home-purchase loan) is approved (1) or not (0)
Regressors: borrower’s characteristics (race,
(27)APPLICATION OF LOGIT/PROBIT