(1)COUNT DATA MODELS
(2)Count data and Poisson distribution
Poisson model
Negative Binomial model
Application of count models
(3)Count data and Poisson distribution
Sometimes the dependent variable is a non-negative integer
Examples:
number of takeover bids received by a target firm
number of unpaid credit installments
number of accidents
number of prepaid mortgage loans
Each is the number of times an event occurs in a given period of time: 0, 1, 2, ...
(4)Count data and Poisson distribution
For a dependent variable that is a non-negative integer, two features matter:
it cannot be negative
it is an integer
(5)Count data and Poisson distribution
The Poisson distribution describes the probability of an event occurring k times in a given period of time
The probability function is
$$\Pr(y = k) = \frac{e^{-\lambda} \lambda^{k}}{k!}, \qquad k = 0, 1, 2, \ldots$$
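To make the probability function concrete, the short sketch below evaluates Pr(y = k) = e^(-lambda) lambda^k / k! for a few values of k. The rate `lam = 2.0` is a hypothetical value chosen only for illustration; it does not come from the lecture data.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """Pr(y = k) = exp(-lam) * lam**k / k! for a Poisson count with rate lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

lam = 2.0  # hypothetical rate, for illustration only
for k in range(6):
    print(f"Pr(y = {k}) = {poisson_pmf(k, lam):.4f}")

# The probabilities over all k sum to one (checked here up to k = 49).
total = sum(poisson_pmf(k, lam) for k in range(50))
print(f"sum of probabilities = {total:.6f}")
```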
(10)Properties of Poisson distribution
One parameter: $\lambda$
Mean is equal to variance: $E(y) = \operatorname{var}(y) = \lambda$
Thus, variance increases with mean
If we model lambda as a function of explanatory variables, we have the Poisson model
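The equality of mean and variance can be checked numerically. The sketch below sums k Pr(y = k) and k^2 Pr(y = k) over a truncated range of k; the three rates are hypothetical, and the cutoff `k_max = 100` is an assumption that is adequate for rates of this size.

```python
import math

def poisson_pmf(k, lam):
    """Pr(y = k) for a Poisson count with rate lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def poisson_moments(lam, k_max=100):
    """Mean and variance computed directly from the probability function."""
    ks = range(k_max + 1)
    mean = sum(k * poisson_pmf(k, lam) for k in ks)
    second_moment = sum(k * k * poisson_pmf(k, lam) for k in ks)
    return mean, second_moment - mean ** 2

for lam in (0.5, 2.0, 10.0):  # hypothetical rates
    mean, var = poisson_moments(lam)
    print(f"lambda = {lam:5.1f}   mean = {mean:.6f}   variance = {var:.6f}")
```

In each case both moments equal lambda, so a larger mean necessarily comes with a larger variance.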
(11)Poisson model
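Mean
$$E(y_i \mid X_i) = \lambda_i = e^{X_i \beta}$$
Probability
$$\Pr(y_i = k \mid X_i) = \frac{e^{-\lambda_i} \lambda_i^{k}}{k!} = \frac{e^{-e^{X_i \beta}} \left(e^{X_i \beta}\right)^{k}}{k!}$$
Log-likelihood function
$$\log L = \sum_{i=1}^{N} \left[ -e^{X_i \beta} + y_i X_i \beta - \log(y_i!) \right]$$
Estimation method: Maximum Likelihood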
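A minimal sketch of maximum likelihood estimation for the Poisson model follows. It maximizes log L = sum_i [ -exp(X_i b) + y_i X_i b - log(y_i!) ] by Newton-Raphson for a model with a constant and one regressor. The toy data `x`, `y` and the function names are hypothetical; statistical packages use more general routines, so this is only an illustration of the idea.

```python
import math

# Hypothetical toy data (not from the lecture): one regressor x and a count y.
x = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
y = [0, 1, 0, 2, 1, 3, 4, 6]

def log_likelihood(b0, b1):
    """log L = sum_i [ -exp(X_i b) + y_i * X_i b - log(y_i!) ]."""
    total = 0.0
    for xi, yi in zip(x, y):
        xb = b0 + b1 * xi
        total += -math.exp(xb) + yi * xb - math.lgamma(yi + 1)  # lgamma(y+1) = log(y!)
    return total

def fit_poisson(max_iter=50, tol=1e-10):
    """Newton-Raphson on the Poisson log-likelihood (constant + one slope)."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0  # start from the constant-only fit
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            mu = math.exp(b0 + b1 * xi)
            g0 += yi - mu            # score with respect to b0
            g1 += (yi - mu) * xi     # score with respect to b1
            h00 += mu                # entries of the negative Hessian
            h01 += mu * xi
            h11 += mu * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (negative Hessian) * step = score.
        s0 = (h11 * g0 - h01 * g1) / det
        s1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + s0, b1 + s1
        if abs(s0) + abs(s1) < tol:
            break
    return b0, b1

b0_hat, b1_hat = fit_poisson()
print(f"b0 = {b0_hat:.4f}, b1 = {b1_hat:.4f}, log L = {log_likelihood(b0_hat, b1_hat):.4f}")
```

At the maximum the score is zero, which with a constant in the model implies that the fitted means sum to the observed counts.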
(12)Interpreting Poisson model estimates
We are interested in how the expected value of y changes when X changes
$$E(y_i \mid X_i) = e^{X_i \beta}$$
The marginal effect
$$\frac{\partial E(y_i \mid X_i)}{\partial X_i} = \beta \, e^{X_i \beta}$$
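Because E(y | X) = exp(X beta), the marginal effect of a continuous regressor is its coefficient times the expected count, so it depends on where X is evaluated. The sketch below compares this analytic effect with a finite-difference approximation. The coefficients and the evaluation point are hypothetical values, not estimates from the lecture.

```python
import math

def expected_count(x, beta):
    """E(y | X) = exp(X beta); x[0] = 1 stands for the constant."""
    return math.exp(sum(xj * bj for xj, bj in zip(x, beta)))

def marginal_effect(x, beta, j):
    """dE(y | X) / dX_j = beta_j * exp(X beta)."""
    return beta[j] * expected_count(x, beta)

beta = [0.5, 0.06, -0.04]  # hypothetical coefficients: constant, x1, x2
x = [1.0, 24.0, 35.0]      # hypothetical evaluation point

analytic = marginal_effect(x, beta, 1)

# Finite-difference check: raise x1 by a small step h and see how E(y | X) moves.
h = 1e-6
x_up = [x[0], x[1] + h, x[2]]
numeric = (expected_count(x_up, beta) - expected_count(x, beta)) / h

print(f"analytic = {analytic:.6f}, finite difference = {numeric:.6f}")
```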
(13)Issue in Poisson model
Under Poisson distribution, mean = variance
This means variance increases with mean
If not:
variance increases at a LOWER rate than mean: UNDERDISPERSION
variance increases at a HIGHER rate than mean: OVERDISPERSION
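A rough first check for under- or overdispersion is the ratio of the sample variance to the sample mean, which should be close to 1 for Poisson data. The two samples below are hypothetical. In a regression setting the relevant comparison is conditional on X, so this raw ratio is only an informal diagnostic, not a formal test.

```python
import statistics

def dispersion_ratio(counts):
    """Sample variance divided by sample mean; about 1 under a Poisson distribution."""
    return statistics.variance(counts) / statistics.mean(counts)

# Two hypothetical samples with the same mean (2.0) but very different spread.
tight_counts = [0, 1, 1, 2, 2, 2, 3, 3, 4, 2]
spread_counts = [0, 0, 0, 0, 1, 0, 0, 7, 0, 12]

for name, counts in (("tight", tight_counts), ("spread", spread_counts)):
    ratio = dispersion_ratio(counts)
    label = "underdispersion" if ratio < 1 else "overdispersion"
    print(f"{name}: mean = {statistics.mean(counts):.2f}, "
          f"variance/mean = {ratio:.2f} -> suggests {label}")
```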
(14)Negative Binomial Model
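If we model
$$\Pr(y_i = k \mid \lambda_i) = \frac{e^{-\lambda_i} \lambda_i^{k}}{k!}$$
with extra unobserved variation in $\lambda_i$,
Then the mean is
$$E(y_i \mid X_i) = e^{X_i \beta}$$
And the variance
$$\operatorname{var}(y_i \mid X_i) = e^{X_i \beta} + \alpha \, e^{2 X_i \beta}$$
This is the negative binomial model
Note: if $\alpha = 0$, the model collapses to Poisson.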
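The sketch below illustrates the two claims numerically, assuming the common NB2 parameterization in which the variance is mu + alpha * mu^2 (the form labeled "Dispersion = mean" in Stata output). The values `mu = 2.0` and `alpha = 0.5` are hypothetical, and the probability formula is the standard NB2 one rather than something stated in the slides.

```python
import math

def nb2_pmf(k, mu, alpha):
    """Negative binomial (NB2) probability with mean mu and dispersion alpha > 0."""
    r = 1.0 / alpha
    log_p = (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
             - r * math.log1p(alpha * mu)
             + k * (math.log(alpha * mu) - math.log1p(alpha * mu)))
    return math.exp(log_p)

def poisson_pmf(k, lam):
    """Poisson probability, computed on the log scale."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def moments(pmf, k_max=2000):
    """Mean and variance from a probability function, truncated at k_max."""
    mean = sum(k * pmf(k) for k in range(k_max + 1))
    second_moment = sum(k * k * pmf(k) for k in range(k_max + 1))
    return mean, second_moment - mean ** 2

mu, alpha = 2.0, 0.5  # hypothetical values
nb_mean, nb_var = moments(lambda k: nb2_pmf(k, mu, alpha))
print(f"NB2: mean = {nb_mean:.4f}, variance = {nb_var:.4f}, "
      f"mu + alpha*mu^2 = {mu + alpha * mu ** 2:.4f}")

# As alpha shrinks toward 0, the NB2 probabilities approach the Poisson ones.
max_gap = max(abs(nb2_pmf(k, mu, 1e-6) - poisson_pmf(k, mu)) for k in range(21))
print(f"largest gap to Poisson at alpha = 1e-6: {max_gap:.2e}")
```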
(15)Analyze the number of non-payments during a credit contract
(16)Data
Dependent variable: number of nonpayments during a credit contract
Independent variables
duration: contracting period (months)
age: (years)
collateral: dummy, = 1 with collateral
edu: schooling years (years)
banking: dummy, = 1 receiving salary via bank account
salary: monthly income (mil VND)
(17)The number of nonpayments
Total 2,110 100.00
(18)Poisson model in Stata
Iteration 0:   log likelihood =  -2066.391
Iteration 1:   log likelihood = -1985.7743
Iteration 2:   log likelihood = -1985.4327
Iteration 3:   log likelihood = -1985.4326

Poisson regression                              Number of obs   =       2110
                                                LR chi2(7)      =    6173.59
                                                Prob > chi2     =     0.0000
Log likelihood = -1985.4326                     Pseudo R2       =     0.6086

      nonpay |      Coef.   Std. Err.       z    P>|z|    [95% Conf. Interval]
-------------+----------------------------------------------------------------
    duration |   .0599389   .0031207    19.21   0.000     .0538225    .0660554
         age |   -.035701   .0028549   -12.51   0.000    -.0412964   -.0301056
  collateral |  -1.540261   .0434424   -35.46   0.000    -1.625407   -1.455116
         edu |  -.0899637   .0048852   -18.42   0.000    -.0995387   -.0803888
     banking |  -4.108164   .1516297   -27.09   0.000    -4.405353   -3.810975
      salary |   .0014747   .0045834     0.32   0.748    -.0075086    .0104581
     married |    .459314    .033624    13.66   0.000     .3934121    .5252159
       _cons |   2.074566   .1463598    14.17   0.000     1.787706    2.361426
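The summary statistics in the Poisson output are linked by simple identities, which the sketch below uses as a consistency check. It takes the reported log likelihood and LR chi2(7), backs out the constant-only log likelihood, and recomputes the pseudo R2 using McFadden's definition, 1 - LL/LL0. It also recomputes one z statistic as coefficient divided by standard error, with the leading decimal point restored on the duration coefficient.

```python
ll_full = -1985.4326  # log likelihood of the fitted model (from the output)
lr_chi2 = 6173.59     # LR chi2(7) (from the output)

# LR chi2 = 2 * (ll_full - ll_null), so the constant-only log likelihood is implied.
ll_null = ll_full - lr_chi2 / 2

# McFadden's pseudo R2 = 1 - ll_full / ll_null.
pseudo_r2 = 1 - ll_full / ll_null
print(f"implied constant-only log likelihood = {ll_null:.4f}")
print(f"pseudo R2 = {pseudo_r2:.4f}")

# z statistic for duration: coefficient divided by its standard error.
z_duration = 0.0599389 / 0.0031207
print(f"z(duration) = {z_duration:.2f}")
```

Both recomputed values agree with the reported Pseudo R2 = 0.6086 and z = 19.21.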
(19)Hypothesis testing
H0: the coefficients of all demographic variables equal zero
 ( 3)  [nonpay]married = 0
           chi2(  3) =  697.41
         Prob > chi2 =    0.0000
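The reported p-value can be recovered from the test statistic. The sketch below uses the closed-form upper-tail probability of a chi-squared distribution with 3 degrees of freedom, P(chi2_3 > x) = erfc(sqrt(x/2)) + sqrt(2x/pi) exp(-x/2), so only the standard library is needed. The function name is hypothetical.

```python
import math

def chi2_sf_df3(x):
    """Upper-tail probability P(chi2_3 > x), closed form for 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

wald = 697.41  # chi2(3) statistic reported in the test output
p_value = chi2_sf_df3(wald)
print(f"p-value = {p_value:.3e}")

# Sanity check against the familiar 5% critical value for 3 degrees of freedom.
print(f"P(chi2_3 > 7.815) = {chi2_sf_df3(7.815):.4f}")
```

The p-value is far below any conventional significance level, so H0 is rejected, in line with the displayed Prob > chi2 = 0.0000.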
(20)Prediction
. sum npay

    Variable |       Obs        Mean    Std. Dev.        Min        Max
-------------+---------------------------------------------------------
        npay |      2110    1.891943    2.583285    .0035831    21.4108
(21)Marginal effects
Marginal effects after poisson
      y  = predicted number of events (predict)
         =  .43216618

variable |      dy/dx    Std. Err.      z    P>|z|   [    95% C.I.    ]        X
---------+----------------------------------------------------------------------
duration |   .0259036       .002    12.93   0.000    .021976   .029832   23.9223
     age |  -.0154288     .00151   -10.19   0.000   -.018397   -.01246   34.9464
collat~l*|  -.6827921     .04277   -15.96   0.000   -.766627  -.598957   .453555
     edu |  -.0388793     .00305   -12.75   0.000   -.044856  -.032902   11.9995
 banking*|  -2.093989     .04643   -45.10   0.000     -2.185  -2.00298   .388152
  salary |   .0006373     .00198     0.32   0.748   -.003245    .00452   11.9332
 married*|   .1951712     .01804    10.82   0.000     .15982   .230522   .555924

(*) dy/dx is for discrete change of dummy variable from 0 to 1
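The marginal effects at the means can be reproduced from the Poisson coefficients. The sketch below evaluates exp(X beta) at the sample means shown in the X column and multiplies by each coefficient of a continuous variable. Decimal points lost in the flattened output are restored here, which is an assumption supported by the reported z statistics. Small discrepancies are expected because the inputs are rounded.

```python
import math

# Poisson coefficients as reported in the regression output.
coef = {"duration": 0.0599389, "age": -0.035701, "collateral": -1.540261,
        "edu": -0.0899637, "banking": -4.108164, "salary": 0.0014747,
        "married": 0.459314, "_cons": 2.074566}

# Sample means from the X column of the marginal-effects table.
means = {"duration": 23.9223, "age": 34.9464, "collateral": 0.453555,
         "edu": 11.9995, "banking": 0.388152, "salary": 11.9332,
         "married": 0.555924}

# Predicted number of events at the means: exp(X beta).
y_hat = math.exp(coef["_cons"] + sum(coef[v] * means[v] for v in means))
print(f"predicted count at the means = {y_hat:.6f}")

# Marginal effect of a continuous variable: beta_j * exp(X beta).
effects = {v: coef[v] * y_hat for v in ("duration", "age", "edu", "salary")}
for v, dydx in effects.items():
    print(f"{v:>8}: dy/dx = {dydx: .7f}")
```

The dummy variables are handled differently in the table: their effect is the change in the predicted count when the dummy moves from 0 to 1.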
(22)Marginal effects at a value point
Marginal effects after poisson
      y  = predicted number of events (predict)
         =  .0642636

variable |      dy/dx    Std. Err.      z    P>|z|   [    95% C.I.    ]        X
---------+----------------------------------------------------------------------
duration |   .0038519     .00069     5.59   0.000    .002501   .005203        24
     age |  -.0022943     .00044    -5.21   0.000   -.003158  -.001431        34
collat~l*|  -.0504903     .00878    -5.75   0.000   -.067693  -.033288
     edu |  -.0057814     .00102    -5.68   0.000   -.007777  -.003786        16
 banking*|  -3.845207     .33501   -11.48   0.000   -4.50182  -3.18859
  salary |   .0000948      .0003     0.31   0.754   -.000498   .000688        30
 married*|   .0236672     .00443     5.34   0.000    .014985    .03235

(*) dy/dx is for discrete change of dummy variable from 0 to 1
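For dummy variables the reported effect is a discrete change: the predicted count with the dummy set to 1 minus the predicted count with it set to 0. The sketch below reproduces this at the value point duration = 24, age = 34, edu = 16, salary = 30. The dummy values are not legible in the table, so `married = 1`, `banking = 1` and `collateral = 0` are assumptions, chosen because they reproduce the reported prediction of .0642636.

```python
import math

# Poisson coefficients as reported in the regression output.
coef = {"duration": 0.0599389, "age": -0.035701, "collateral": -1.540261,
        "edu": -0.0899637, "banking": -4.108164, "salary": 0.0014747,
        "married": 0.459314, "_cons": 2.074566}

# Evaluation point. The three dummy values are ASSUMED, not read from the table.
point = {"duration": 24, "age": 34, "collateral": 0, "edu": 16,
         "banking": 1, "salary": 30, "married": 1}

def predict(x):
    """Predicted number of events exp(X beta) at the point x."""
    return math.exp(coef["_cons"] + sum(coef[v] * x[v] for v in x))

def discrete_change(x, dummy):
    """E(y | dummy = 1) - E(y | dummy = 0), other variables held at x."""
    return predict({**x, dummy: 1}) - predict({**x, dummy: 0})

print(f"predicted count = {predict(point):.7f}")
for d in ("married", "banking", "collateral"):
    print(f"{d:>10}: discrete change = {discrete_change(point, d): .7f}")
```

Under these assumed dummy values the three discrete changes match the starred rows of the table to the displayed precision.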
(23)Marginal effects at a value point
Marginal effects after poisson
      y  = predicted number of events (predict)
         =  .43418424

variable |      dy/dx    Std. Err.      z    P>|z|   [    95% C.I.    ]        X
---------+----------------------------------------------------------------------
duration |   .0260245     .00202    12.90   0.000    .022071   .029978        24
     age |  -.0155008     .00152   -10.19   0.000   -.018483  -.012519   34.9464
collat~l*|  -.6859805     .04296   -15.97   0.000   -.770188  -.601773   .453555
     edu |  -.0390608     .00306   -12.75   0.000   -.045065  -.033057   11.9995
 banking*|  -2.103768     .04655   -45.19   0.000   -2.19501  -2.01252   .388152
  salary |   .0006403     .00199     0.32   0.748    -.00326   .004541   11.9332
 married*|   .1960826     .01812    10.82   0.000    .160569   .231597   .555924

(*) dy/dx is for discrete change of dummy variable from 0 to 1
means used for age collateral edu banking salary married
(24)Negative binomial model
Negative binomial regression                    Number of obs   =       2110
                                                LR chi2(7)      =    3639.28
Dispersion     = mean                           Prob > chi2     =     0.0000
Log likelihood = -1985.4326                     Pseudo R2       =     0.4782

      nonpay |      Coef.   Std. Err.       z    P>|z|    [95% Conf. Interval]
-------------+----------------------------------------------------------------
    duration |   .0599365   .0031207    19.21   0.000       .05382    .0660529
         age |  -.0356999   .0028549   -12.50   0.000    -.0412953   -.0301044
  collateral |  -1.540237   .0434421   -35.45   0.000    -1.625382   -1.455092
         edu |  -.0899611   .0048852   -18.41   0.000     -.099536   -.0803861
     banking |  -4.108135   .1516278   -27.09   0.000     -4.40532    -3.81095
      salary |   .0014748   .0045834     0.32   0.748    -.0075086    .0104581
     married |   .4592988    .033624    13.66   0.000      .393397    .5252006
       _cons |   2.074564   .1463599    14.17   0.000     1.787704    2.361424
-------------+----------------------------------------------------------------
    /lnalpha |   -18.2266   155.4434                     -322.8902    286.4369
-------------+----------------------------------------------------------------
       alpha |   1.21e-08   1.89e-06                      5.9e-141    2.5e+124

Likelihood-ratio test of alpha=0:  chibar2(01) = 3.5e-05   Prob>=chibar2 = 0.498
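The last line of the output tests alpha = 0, that is, whether the negative binomial model reduces to Poisson. Because alpha cannot be negative, the null value lies on the boundary of the parameter space and the statistic is compared with a 50:50 mixture of chi2(0) and chi2(1), which is what chibar2(01) denotes. A minimal sketch of the p-value calculation, with a hypothetical function name:

```python
import math

def chibar2_01_pvalue(lr_stat):
    """P-value under a 50:50 mixture of chi2(0) and chi2(1), for lr_stat > 0."""
    # P(chi2_1 > x) = erfc(sqrt(x / 2)); the chi2(0) half adds nothing when x > 0.
    return 0.5 * math.erfc(math.sqrt(lr_stat / 2))

lr_stat = 3.5e-05  # chibar2(01) statistic reported in the output
p_value = chibar2_01_pvalue(lr_stat)
print(f"p-value = {p_value:.3f}")
```

The p-value is about 0.498, so alpha = 0 is not rejected. This fits the estimate alpha = 1.21e-08 and the fact that the log likelihood and coefficients are essentially identical to the Poisson results.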
(25)Application of Count Models
Dionne & Vanasse (1992) Automobile Insurance Ratemaking in the Presence of Asymmetrical Information. J of Applied Econometrics 7: 149-65.
Quebec drivers 1982-83, for insurers to classify drivers.
dep var: number of accidents reported by police
indep var: driver’s characteristics [age, gender,
(26)Application of Count Models
Greene (1994) Accounting for Excess Zeros and Sample Selection in Poisson and Negative Binomial Regression Models. Working Paper, New York University.
data of credit card applicants
dep var: number of derogatory reports
indep var: income, expenditure, age
(27)Application of Count Models
Dionne et al (1996) Count Data Models for a Credit Scoring System. J of Empirical Finance 3: 303-25.
data of 4,700 clients granted credit by a bank
dep var: number of unpaid monthly payments during the contracting period
indep var:
income
age
duration of contracting period
marital status
(28)Application of Count Models
Jaggia & Thosar (1993) Multiple Bids as a Consequence of Target Management Resistance: A Count Data Approach. Review of Quantitative Finance and Accounting 3: 447-57.
data of 126 firms that were targets of tender offers 1978-85
dep var: number of bids received after the initial bid
indep var:
legal defense
real restructuring
financial restructuring
white knight
initial bid premium
institutional holdings
size