Advanced Econometrics - Part II
Chapter 6: Models for count data
Nam T Hoang, UNE Business School, University of New England

Chapter 6
MODELS FOR COUNT DATA

A count variable is a variable that takes on non-negative integer values:
• There is no natural upper bound.
• The outcome will be zero for at least some members of the population.

$Y$ is a count variable and $X$ is a vector of explanatory variables. It is better to model $E(Y \mid X)$ directly and to choose functional forms that ensure positivity for any value of $X$ and any parameter value. When $Y$ has no upper bound, the most popular of these is the exponential function:

$$E(Y \mid X) = \exp(X\beta)$$

I POISSON REGRESSION MODEL:

• The basic Poisson regression model assumes that $Y$ given $X = (X_1, X_2, \dots, X_k)$ has a Poisson distribution.
• The Poisson regression model specifies that each $Y_i$ is drawn from a Poisson distribution with parameter $\lambda_i$, which is related to the regressors $X_i$:

$$\Pr(Y = Y_i \mid X_i) = \frac{e^{-\lambda_i}\lambda_i^{Y_i}}{Y_i!}, \qquad Y_i! = 1 \times 2 \times \dots \times Y_i$$

$\lambda_i$ and $X_i$ are related by:

$$\ln \lambda_i = X_i\beta \quad \text{or} \quad \lambda_i = e^{X_i\beta}$$

The expected number of events is given by (Poisson distribution properties):

$$E[Y_i \mid X_i] = \mathrm{Var}[Y_i \mid X_i] = \lambda_i = e^{X_i\beta}$$

So:

• $$\frac{\partial E[Y_i \mid X_i]}{\partial X_i} = \lambda_i\beta$$
With the parameter estimates in hand, this vector can be computed using any data vector desired.
• In principle, the Poisson model is simply a non-linear regression, but it is easier to estimate the parameters with maximum likelihood techniques.

The log-likelihood function is:

$$\ln L = \sum_{i=1}^{n}\left[-\lambda_i + Y_i(X_i\beta) - \ln(Y_i!)\right] = \sum_{i=1}^{n}\left[-e^{X_i\beta} + Y_i(X_i\beta) - \ln(Y_i!)\right]$$

The likelihood equations are:

$$\frac{\partial \ln L}{\partial \beta} = \sum_{i=1}^{n}(Y_i - \lambda_i)X_i = \sum_{i=1}^{n}\left(Y_i - e^{X_i\beta}\right)X_i = 0$$

The Hessian is (in matrix products $X_i$ is treated as a $k \times 1$ column vector, so $X_i\beta$ is shorthand for $X_i'\beta$):

$$\frac{\partial^2 \ln L}{\partial\beta\,\partial\beta'} = -\sum_{i=1}^{n}\lambda_i X_i X_i'$$

The Hessian is negative definite for all $X$ and $\beta$. The Newton-Raphson method is a simple algorithm for this model and will converge rapidly. At convergence,

$$V = \left[\sum_{i=1}^{n}\hat\lambda_i X_i X_i'\right]^{-1}$$

is an estimator of the asymptotic covariance matrix for $\hat\beta$, where $\hat\lambda_i = \exp(X_i\hat\beta)$.

$\hat\lambda_i = \exp(X_i\hat\beta)$ is the prediction for observation $i$. The estimated variance of $\hat\lambda_i$ is $\hat\lambda_i^2 X_i' V X_i$, where $V$ is the estimated asymptotic covariance matrix for $\hat\beta$ given above.

II GOODNESS OF FIT:

$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left[\dfrac{Y_i - \hat\lambda_i}{\sqrt{\hat\lambda_i}}\right]^2}{\sum_{i=1}^{n}\left[\dfrac{Y_i - \bar Y}{\sqrt{\bar Y}}\right]^2}$$

This measure compares the fit of the model with that of a model containing only a constant term.

Note: $Y_i$ is an integer, whereas the prediction $\hat\lambda_i = e^{X_i\hat\beta}$ is continuous.

III OVERDISPERSION:

The Poisson model has been criticized because of its implicit assumption that the variance of $Y_i$ equals its mean. Many extensions of the Poisson model that relax this assumption have been proposed.

Test for overdispersion:

$$H_0: \mathrm{Var}[Y_i] = E[Y_i]$$
$$H_A: \mathrm{Var}[Y_i] = E[Y_i] + \alpha\, g(E[Y_i])$$

Regress:

$$Z_i = \frac{(Y_i - \hat\lambda_i)^2 - Y_i}{\hat\lambda_i}$$

Most count models with overdispersion (variance exceeds the mean) specify the overdispersion to be of the form:

$$\mathrm{Var}[Y_i \mid X_i] = E[Y_i] + \alpha\, g(E[Y_i])$$

where $\alpha$ is an unknown parameter and $g(\cdot)$ is a known function, most commonly $g(\mu) = \mu^2$ or $g(\mu) = \mu$.

The test of $H_0: \alpha = 0$ against $H_A: \alpha \neq 0$ (or $\alpha > 0$) can be carried out by running the regression:

$$\frac{(Y_i - \hat\lambda_i)^2 - Y_i}{\hat\lambda_i} = \alpha\,\frac{g(\hat\lambda_i)}{\hat\lambda_i} + u_i$$

where $u_i$ is an error term. The reported t-statistic for $\alpha$ is asymptotically normal under $H_0: \alpha = 0$ (Cameron & Trivedi 1990). This test can also be used for underdispersion, $\alpha < 0$, in which case the conditional variance is less than the conditional mean.

Let $\mu_i$ be the conditional mean and variance of the Poisson distribution, and suppose now that this parameter is random rather than being a completely deterministic function of the regressors $X_i$. Let:

$$\mu_i = \lambda_i u_i \;\rightarrow\; \ln\mu_i = \ln\lambda_i + \ln u_i = X_i\beta + \varepsilon_i$$

The distribution of $Y_i$ conditional on $X_i$ and $u_i$ (that is, on $\varepsilon_i$) remains Poisson with conditional mean and variance $\mu_i$:

$$f(Y_i \mid X_i, u_i) = \Pr(Y = Y_i \mid X_i, u_i) = \frac{e^{-\lambda_i u_i}(\lambda_i u_i)^{Y_i}}{Y_i!}$$

$$\rightarrow\; \Pr(Y = Y_i \mid X_i) = \int_0^\infty \frac{e^{-\lambda_i u_i}(\lambda_i u_i)^{Y_i}}{Y_i!}\, g(u_i)\, du_i$$

$g(u_i)$ is the density function of $u_i$. The choice of $g(u_i)$ defines the unconditional distribution. For mathematical convenience, a gamma distribution is usually assumed for $u_i = \exp(\varepsilon_i)$. Assume $E(u_i) = 1$ (so that $E(\mu_i \mid \lambda_i) = \lambda_i$):

$$g(u_i) = \frac{\theta^\theta}{\Gamma(\theta)}\, e^{-\theta u_i} u_i^{\theta - 1}$$

The density function for $Y_i$ is then:

$$f(Y_i \mid X_i) = \Pr(Y = Y_i \mid X_i) = \int_0^\infty \frac{e^{-\lambda_i u_i}(\lambda_i u_i)^{Y_i}}{Y_i!}\,\frac{\theta^\theta u_i^{\theta-1} e^{-\theta u_i}}{\Gamma(\theta)}\, du_i = \frac{\Gamma(\theta + Y_i)}{\Gamma(Y_i + 1)\Gamma(\theta)}\, r_i^{Y_i}(1 - r_i)^\theta, \qquad r_i = \frac{\lambda_i}{\lambda_i + \theta}$$

IV NEGATIVE BINOMIAL REGRESSION MODEL:

• The assumed equality of the conditional mean and variance is the major shortcoming of the Poisson model.
• We generalize the Poisson model by introducing an individual unobserved effect into the conditional mean.
• Suppose now that the conditional mean and variance $\mu_i$ of the Poisson distribution is random rather than being a completely deterministic function of $X_i$. Because of unobserved heterogeneity, different observations may have different $\mu_i$, and part of this difference is due to a random (unobserved) component $u_i$, not only to $X_i$. $\mu_i$ is just a parameter of the distribution: we want $E(\mu_i) = e^{X_i\beta}$; we do not want $\mu_i = e^{X_i\beta}$.

Let:

$$\mu_i = \lambda_i u_i, \quad \text{where } \lambda_i = e^{X_i\beta} \;\rightarrow\; \ln\mu_i = \ln\lambda_i + \ln u_i = X_i\beta + \varepsilon_i$$

The disturbance $\varepsilon_i$ reflects the cross-sectional heterogeneity that normally characterizes microeconomic data. The distribution of $Y_i$ conditional on $X_i$ and $u_i$ remains Poisson with conditional mean and variance $\mu_i$:

$$f(Y_i \mid X_i, u_i) = \Pr(Y = Y_i \mid X_i, u_i) = \frac{e^{-\lambda_i u_i}(\lambda_i u_i)^{Y_i}}{Y_i!}$$

The unconditional distribution $\Pr(Y = Y_i \mid X_i)$ is the expected value over $u_i$ of $f(Y_i \mid X_i, u_i)$:

$$\Pr(Y = Y_i \mid X_i) = \int_0^\infty \frac{e^{-\lambda_i u_i}(\lambda_i u_i)^{Y_i}}{Y_i!}\, g(u_i)\, du_i$$

$g(u_i)$ is the density function of $u_i$. The problem is the choice of $g(u_i)$. For mathematical convenience, a gamma distribution is usually assumed for $u_i$. Assume $E(u_i) = 1$ (so that $E(\mu_i \mid \lambda_i) = \lambda_i$):

$$g(u_i) = \frac{\theta^\theta}{\Gamma(\theta)}\, e^{-\theta u_i} u_i^{\theta - 1}$$

Then:

$$\Pr(Y = Y_i \mid X_i) = \int_0^\infty \frac{e^{-\lambda_i u_i}(\lambda_i u_i)^{Y_i}}{Y_i!}\,\frac{\theta^\theta u_i^{\theta-1} e^{-\theta u_i}}{\Gamma(\theta)}\, du_i$$
$$= \frac{\theta^\theta \lambda_i^{Y_i}}{\Gamma(Y_i + 1)\Gamma(\theta)} \int_0^\infty e^{-(\lambda_i + \theta)u_i}\, u_i^{\theta + Y_i - 1}\, du_i$$
$$= \frac{\theta^\theta \lambda_i^{Y_i}\,\Gamma(Y_i + \theta)}{\Gamma(Y_i + 1)\Gamma(\theta)(\lambda_i + \theta)^{\theta + Y_i}}$$
$$= \frac{\Gamma(Y_i + \theta)}{\Gamma(Y_i + 1)\Gamma(\theta)}\, r_i^{Y_i}(1 - r_i)^\theta, \qquad \text{where } r_i = \frac{\lambda_i}{\lambda_i + \theta}$$

This is the form of the negative binomial distribution. The distribution has conditional mean $\lambda_i$ and conditional variance $\lambda_i\left(1 + \frac{1}{\theta}\lambda_i\right)$:

$$E[Y_i \mid X_i] = \lambda_i = e^{X_i\beta}$$
$$\mathrm{Var}[Y_i \mid X_i] = \lambda_i\left(1 + \frac{1}{\theta}\lambda_i\right)$$

Note (gamma function):

$$\Gamma(P) = \int_0^\infty t^{P-1} e^{-t}\, dt$$

We have $\Gamma(P) = (P - 1)\Gamma(P - 1)$, so $\Gamma(P) = (P - 1)!$ if $P$ is a positive integer. The gamma function is a generalization of the factorial function to non-integer values.

Note (gamma distribution):

$$f(x) = \frac{\lambda^P}{\Gamma(P)}\, e^{-\lambda x} x^{P-1}, \qquad x \geq 0,\ \lambda > 0,\ P > 0$$

with $E(x) = \dfrac{P}{\lambda}$ and $V(x) = \dfrac{P}{\lambda^2}$. If $E(x) = 1$ then $\lambda = P$. Since $E(\mu_i \mid \lambda_i) = \lambda_i$ when $E(u_i) = 1$, $\lambda_i$ has the same interpretation as in the Poisson model.

This negative binomial model can be estimated by maximum likelihood without much difficulty. A test of the Poisson distribution is often carried out by testing the hypothesis $\alpha = \frac{1}{\theta} = 0$ using the Wald test.

The ratio of the variance to the mean is now $1 + \dfrac{\lambda_i}{\theta} > 1$, which differs across observations.

The log-likelihood is:

$$\ln L = \sum_{i=1}^{n}\left[\ln\Gamma(Y_i + \theta) - \ln\Gamma(Y_i + 1) - \ln\Gamma(\theta) + Y_i\ln r_i + \theta\ln(1 - r_i)\right], \qquad r_i = \frac{\lambda_i}{\lambda_i + \theta},\quad \lambda_i = e^{X_i\beta}$$

It can be estimated by MLE easily.

Application: Doctor visits: count data models, with $X_{it} = (1, \text{Age}, \text{Education}, \text{Income}, \text{Kids}, \text{Insurance})$.

V TOO MANY ZEROS DATA:

In many data sets there is a large number of zero counts. Assuming a Poisson or negative binomial distribution is then a misspecification. An alternative is the zero-inflated Poisson model:

• A binary probability model determines whether a zero or a nonzero outcome occurs; then
• A truncated Poisson distribution describes the positive outcomes.

$$\Pr(Y_i = 0 \mid X_i) = e^{-\theta}$$
$$\Pr(Y_i = j \mid X_i) = \frac{(1 - e^{-\theta})\, e^{-\lambda_i}\lambda_i^{j}}{j!\,(1 - e^{-\lambda_i})}, \qquad j = 1, 2, \dots$$

$$\Pr(Z_i = 1 \mid W_i) = F(W_i, \gamma)$$
$$\Pr(Y_i = j \mid X_i, Z_i = 1) = \frac{e^{-\lambda_i}\lambda_i^{j}}{j!}$$

$$E(Y_i \mid X_i) = F \times 0 + (1 - F) \times E[Y_i^* \mid X_i, Y_i^* > 0] = (1 - F)\,\frac{\lambda_i}{1 - e^{-\lambda_i}}$$

where $Y^*$ denotes the outcome of the Poisson process in regime 2.

$$\Pr(Y_i = 0 \mid X_i) = \Pr(\text{regime 1}) + \Pr(Y_i = 0 \mid X_i, \text{regime 2}) \times \Pr(\text{regime 2})$$
$$\Pr(Y_i = j \mid X_i) = \Pr(Y_i = j \mid X_i, \text{regime 2}) \times \Pr(\text{regime 2}), \qquad j = 1, 2, \dots$$
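
Returning to the Newton-Raphson remark in Section I, the iteration can be sketched in a few lines. This is only an illustration under simplifying assumptions: a constant plus a single regressor (so the 2x2 system is solved by hand), made-up data, and a hypothetical function name `fit_poisson_nr`; it is not taken from the notes.

```python
import math

def fit_poisson_nr(x, y, tol=1e-10, max_iter=50):
    """Newton-Raphson for E[Y|x] = exp(b0 + b1*x); returns (b0, b1, V)."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0  # start from the constant-only fit

    def pieces(c0, c1):
        lam = [math.exp(c0 + c1 * xi) for xi in x]
        # Score: sum_i (Y_i - lambda_i) X_i, with X_i = (1, x_i)
        g0 = sum(yi - li for yi, li in zip(y, lam))
        g1 = sum((yi - li) * xi for xi, yi, li in zip(x, y, lam))
        # Negative Hessian: sum_i lambda_i X_i X_i' (symmetric 2x2)
        h00 = sum(lam)
        h01 = sum(li * xi for xi, li in zip(x, lam))
        h11 = sum(li * xi * xi for xi, li in zip(x, lam))
        return g0, g1, h00, h01, h11

    for _ in range(max_iter):
        g0, g1, h00, h01, h11 = pieces(b0, b1)
        det = h00 * h11 - h01 * h01
        # Newton step: (negative Hessian)^{-1} times the score
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < tol:
            break

    # V = [sum_i lambda_hat_i X_i X_i']^{-1} evaluated at the final estimates
    _, _, h00, h01, h11 = pieces(b0, b1)
    det = h00 * h11 - h01 * h01
    V = [[h11 / det, -h01 / det], [-h01 / det, h00 / det]]
    return b0, b1, V

# Illustrative (made-up) data
x = [0, 1, 2, 3, 4, 5, 6, 7]
y = [1, 0, 2, 3, 2, 5, 6, 9]
b0_hat, b1_hat, V = fit_poisson_nr(x, y)
lam_hat = [math.exp(b0_hat + b1_hat * xi) for xi in x]
```

Because the model includes a constant, the first likelihood equation implies that the fitted values sum to the observed counts, which gives a quick check that the iteration has converged.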
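
The regression-based overdispersion test of Section III can be sketched as follows. The function name `overdispersion_t` and the two-observation data are hypothetical, and the use of the ordinary least-squares standard error for the t-statistic is an assumption of this sketch; the default `g` corresponds to $g(\mu) = \mu^2$.

```python
import math

def overdispersion_t(y, lam_hat, g=lambda m: m * m):
    """Regress z_i on w_i without a constant; return (alpha_hat, t_stat)."""
    # z_i = ((Y_i - lambda_hat_i)^2 - Y_i) / lambda_hat_i
    z = [((yi - li) ** 2 - yi) / li for yi, li in zip(y, lam_hat)]
    # w_i = g(lambda_hat_i) / lambda_hat_i
    w = [g(li) / li for li in lam_hat]
    sww = sum(wi * wi for wi in w)
    alpha = sum(wi * zi for wi, zi in zip(w, z)) / sww
    resid = [zi - alpha * wi for wi, zi in zip(w, z)]
    s2 = sum(e * e for e in resid) / (len(y) - 1)  # one estimated coefficient
    return alpha, alpha / math.sqrt(s2 / sww)

# Tiny made-up example: fitted means of 2.0 for observed counts 0 and 4
alpha_hat, t_stat = overdispersion_t([0, 4], [2.0, 2.0])
```

In practice `lam_hat` would be the fitted values from the Poisson regression, and the t-statistic would be compared with standard normal critical values.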
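
As a numerical check on the negative binomial results of Section IV, the sketch below evaluates one term of the log-likelihood with `math.lgamma` and confirms that the probabilities sum to one with mean $\lambda_i$ and variance $\lambda_i(1 + \lambda_i/\theta)$. The function name and the values $\lambda = 3$, $\theta = 2$ are illustrative assumptions.

```python
import math

def negbin_logpmf(y, lam, theta):
    """ln Gamma(y+theta) - ln Gamma(y+1) - ln Gamma(theta) + y ln r + theta ln(1-r)."""
    r = lam / (lam + theta)
    return (math.lgamma(y + theta) - math.lgamma(y + 1) - math.lgamma(theta)
            + y * math.log(r) + theta * math.log(1 - r))

lam, theta = 3.0, 2.0
# Truncate the infinite support at 400; the remaining tail mass is negligible here
probs = [math.exp(negbin_logpmf(k, lam, theta)) for k in range(400)]
total = sum(probs)
mean = sum(k * p for k, p in enumerate(probs))
var = sum((k - mean) ** 2 * p for k, p in enumerate(probs))
```

Summing `negbin_logpmf` over the observations, with `lam` set to $e^{X_i\beta}$, gives the log-likelihood that would be maximized over $\beta$ and $\theta$.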
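
The first pair of probabilities in Section V (a zero with probability $e^{-\theta}$, positive counts from a truncated Poisson) can be checked numerically. The sketch below assumes $1 - F = 1 - e^{-\theta}$ when comparing with the mean formula; the function name `hurdle_pmf` and the parameter values are hypothetical.

```python
import math

def hurdle_pmf(j, lam, theta):
    """Pr(Y=0) = exp(-theta); for j >= 1, (1 - exp(-theta)) times the truncated Poisson."""
    if j == 0:
        return math.exp(-theta)
    truncated = math.exp(-lam) * lam ** j / (math.factorial(j) * (1 - math.exp(-lam)))
    return (1 - math.exp(-theta)) * truncated

lam, theta = 2.5, 0.4
# Truncate the support at 100; the Poisson tail beyond that is negligible here
probs = [hurdle_pmf(j, lam, theta) for j in range(100)]
total = sum(probs)
mean = sum(j * p for j, p in enumerate(probs))
# Mean implied by (1 - F) * lambda / (1 - exp(-lambda)) with 1 - F = 1 - exp(-theta)
implied_mean = (1 - math.exp(-theta)) * lam / (1 - math.exp(-lam))
```

The probabilities sum to one because the truncated Poisson is rescaled by $1 - e^{-\lambda_i}$, which is why the same factor appears in the mean.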