Advanced Econometrics - Part II
Chapter 1: Review of Least Squares & Likelihood Methods
Nam T Hoang, UNE Business School, University of New England

I LEAST SQUARES METHODS:

Model:
- We have $n$ observations (individuals, firms, ...) drawn randomly from a large population, $i = 1, 2, \dots, n$.
- On observation $i$ we observe $Y_i$ and a $K$-dimensional column vector of explanatory variables $X_i = (X_{i1}, X_{i2}, \dots, X_{iK})'$, and we assume $X_{i1} = 1$ for all $i = 1, 2, \dots, n$.
- We are interested in explaining the distribution of $Y_i$ in terms of the explanatory variables $X_i$ using the linear model

$$Y_i = \beta' X_i + \varepsilon_i, \qquad \beta = (\beta_1, \dots, \beta_K)',$$

or, written out, $Y_i = \beta_1 + \beta_2 X_{i2} + \beta_3 X_{i3} + \dots + \beta_K X_{iK} + \varepsilon_i$. In matrix notation: $Y = X\beta + \varepsilon$.

Assumption 1: $\{X_i, Y_i\}_{i=1}^{n}$ are independent and identically distributed.
Assumption 2: $\varepsilon_i \mid X_i \sim N(0, \sigma^2)$.
Assumption 3: $\varepsilon_i \perp X_i$ ($\varepsilon_i$ independent of $X_i$).
Assumption 4: $E[\varepsilon_i \mid X_i] = 0$.
Assumption 5: $E[\varepsilon_i X_i] = 0$.

The Ordinary Least Squares (OLS) estimator for $\beta$ solves

$$\min_{\beta} \sum_{i=1}^{n} (Y_i - \beta' X_i)^2.$$

This leads to

$$\hat{\beta} = \Big(\sum_{i=1}^{n} X_i X_i'\Big)^{-1} \sum_{i=1}^{n} X_i Y_i = (X'X)^{-1} X'Y.$$

The exact distribution of the OLS estimator under the normality assumption is

$$\hat{\beta} \sim N\big(\beta, \sigma^2 (X'X)^{-1}\big).$$

• Without normality of the $\varepsilon_i$ it is difficult to derive the exact distribution of $\hat{\beta}$. However, we can establish the asymptotic distribution:

$$\sqrt{n}\,(\hat{\beta} - \beta) \xrightarrow{d} N\big(0,\; \sigma^2 E[XX']^{-1}\big).$$

• We do not know $\sigma^2$, but we can estimate it consistently (with a degrees-of-freedom correction) as

$$\hat{\sigma}^2 = \frac{1}{n - K} \sum_{i=1}^{n} (Y_i - \hat{\beta}' X_i)^2.$$

• In practice, whether or not the error terms are exactly normal, we use the following approximate distribution for $\hat{\beta}$:

$$\hat{\beta} \approx N(\beta, V), \qquad V = \sigma^2 (X'X)^{-1},$$

and estimate $V$ by

$$\hat{V} = \hat{\sigma}^2 \Big(\sum_{i=1}^{n} X_i X_i'\Big)^{-1}.$$

• If we are interested in a specific coefficient: $\hat{\beta}_k \approx N(\beta_k, \hat{V}_{kk})$, where $\hat{V}_{ij}$ is the $(i,j)$ element of the matrix $\hat{V}$.

• A 95% confidence interval for $\beta_k$ is

$$\Big(\hat{\beta}_k - 1.96\sqrt{\hat{V}_{kk}},\; \hat{\beta}_k + 1.96\sqrt{\hat{V}_{kk}}\Big).$$

• To test the hypothesis that $\beta_k = \alpha$, use the statistic

$$t = \frac{\hat{\beta}_k - \alpha}{\sqrt{\hat{V}_{kk}}} \sim N(0, 1).$$
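The OLS formulas above can be checked numerically. The following is a minimal sketch on simulated data; the sample size, the coefficient values, and the error scale are all illustrative assumptions, not part of the notes.

```python
import numpy as np

rng = np.random.default_rng(0)
n, K = 200, 3  # hypothetical sample size and number of regressors (incl. constant)
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])  # X_{i1} = 1
beta_true = np.array([1.0, 2.0, -0.5])
y = X @ beta_true + rng.normal(scale=1.5, size=n)  # eps_i | X_i ~ N(0, sigma^2)

# OLS: beta_hat = (X'X)^{-1} X'y
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# sigma^2 estimate with degrees-of-freedom correction n - K
resid = y - X @ beta_hat
sigma2_hat = resid @ resid / (n - K)

# V_hat = sigma2_hat * (X'X)^{-1}; standard errors and 95% confidence intervals
V_hat = sigma2_hat * np.linalg.inv(X.T @ X)
se = np.sqrt(np.diag(V_hat))
ci = np.column_stack([beta_hat - 1.96 * se, beta_hat + 1.96 * se])

# t-statistics for the hypotheses beta_k = 0
t_stats = beta_hat / se
```

With a simulated sample this size, the estimates land close to the true coefficients and the intervals have the stated width $2 \times 1.96\sqrt{\hat{V}_{kk}}$.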
Robust Variances:

If we do not have the homoskedasticity assumption, then

$$\sqrt{n}\,(\hat{\beta} - \beta) \xrightarrow{d} N\big(0,\; (E[XX'])^{-1}\, E[\varepsilon^2 XX']\, (E[XX'])^{-1}\big).$$

We can estimate the heteroskedasticity-consistent variance by White's estimator:

$$\hat{V} = \Big(\frac{1}{n}\sum_{i=1}^{n} X_i X_i'\Big)^{-1} \Big(\frac{1}{n}\sum_{i=1}^{n} \hat{\varepsilon}_i^2 X_i X_i'\Big) \Big(\frac{1}{n}\sum_{i=1}^{n} X_i X_i'\Big)^{-1}.$$

II MAXIMUM LIKELIHOOD ESTIMATION:

Introduction:

• Linear regression model: $Y_i = X_i'\beta + \varepsilon_i$ with $\varepsilon_i \mid X_i \sim N(0, \sigma^2)$.

OLS:

$$\hat{\beta} = \arg\min_{\beta} \sum_{i=1}^{n} (Y_i - X_i'\beta)^2 \;\Rightarrow\; \hat{\beta} = \Big(\sum_{i=1}^{n} X_i X_i'\Big)^{-1} \sum_{i=1}^{n} X_i Y_i.$$

• Maximum likelihood estimator:

$$(\hat{\beta}_{MLE}, \hat{\sigma}^2_{MLE}) = \arg\max_{\beta, \sigma^2} L(\beta, \sigma^2),$$

where

$$L(\beta, \sigma^2) = \sum_{i=1}^{n} \Big[-\tfrac{1}{2}\ln(2\pi\sigma^2) - \frac{(Y_i - X_i'\beta)^2}{2\sigma^2}\Big] = -\frac{n}{2}\ln(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^{n} (Y_i - X_i'\beta)^2.$$

Note: if $X \sim N(\mu, \sigma^2)$, the density function of $X$ is

$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}.$$

• This leads to the same estimator for $\beta$ as OLS, and the MLE approach is a systematic way to deal with complex nonlinear models. The variance estimator is

$$\hat{\sigma}^2_{MLE} = \frac{1}{n}\sum_{i=1}^{n} (Y_i - X_i'\hat{\beta})^2.$$

Likelihood function:

• Suppose we have independent and identically distributed random variables $Z_1, \dots, Z_n$ with common density $f(Z_i, \theta)$. The likelihood function given a sample $Z_1, Z_2, \dots, Z_n$ is

$$\mathcal{L}(\theta) = \prod_{i=1}^{n} f(Z_i, \theta).$$

• The log-likelihood function:

$$L(\theta) = \ln \mathcal{L}(\theta) = \sum_{i=1}^{n} \ln f(Z_i, \theta).$$

• Building the likelihood function is the first step; the job search model below illustrates this.

• An example of a maximum likelihood function: an unemployed individual is assumed to receive job offers arriving at rate $\lambda$, such that the expected number of job offers arriving in a short interval of length $dt$ is $\lambda\, dt$. Each offer consists of some wage rate $w$, drawn independently of previous wages from a continuous distribution with distribution function $F_w(w)$. If the offer is better than the reservation wage $\bar{w}$, that is, with probability $1 - F(\bar{w})$, the offer is accepted.
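As a sanity check on this setup, a small simulation can confirm that accepted offers behave like arrivals at rate $\lambda\,(1 - F_w(\bar{w}))$. All numbers here are hypothetical choices: $\lambda = 2$, a standard lognormal wage distribution (for which $F_w(1) = \Phi(\ln 1) = 0.5$), and reservation wage $\bar{w} = 1$.

```python
import numpy as np

rng = np.random.default_rng(4)
lam = 2.0    # hypothetical offer arrival rate (lambda)
w_bar = 1.0  # hypothetical reservation wage
# For a standard lognormal, F_w(w_bar) = Phi(ln 1) = 0.5, so the
# acceptance rate is theta = lam * (1 - F_w(w_bar)) = 1.0.
theta = lam * (1 - 0.5)

def time_to_acceptance(rng):
    """Waiting time until the first offer above the reservation wage."""
    t = 0.0
    while True:
        t += rng.exponential(1.0 / lam)                 # gap until next offer
        if rng.lognormal(mean=0.0, sigma=1.0) > w_bar:  # offer accepted?
            return t

durations = np.array([time_to_acceptance(rng) for _ in range(20000)])
# The mean duration should be close to 1 / theta.
```

By the thinning property of the Poisson process, accepted offers themselves form a Poisson process with rate $\lambda\,(1 - F_w(\bar{w}))$, which is why the simulated mean duration comes out near $1/\theta$, consistent with the exponential-duration result used in these notes.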
The reservation wage is set to maximize utility. Suppose that the arrival rate is constant over time; then the optimal reservation wage is also constant over time. The probability of receiving an acceptable offer in a short interval $dt$ is $\theta\, dt$, with

$$\theta = \lambda\,(1 - F(\bar{w})).$$

The constant acceptance rate $\theta$ implies that the distribution of the unemployment duration $y$ (a random variable) is exponential with mean $1/\theta$, variance $1/\theta^2$, and density function

$$f(y) = \theta e^{-\theta y}.$$

$$S(y) = 1 - F(y) = e^{-\theta y} \quad \text{(survivor function)}$$

$$h(y) = \lim_{dy \to 0} \frac{\Pr(y < Y < y + dy)}{dy\, \Pr(Y > y)} = \frac{f_Y(y)}{S(y)} = \theta \quad \text{(hazard function)}$$

(the rate at which a job is offered and accepted).

Likelihood function:

a) If we observe the exact unemployment durations $y_i$:

$$\mathcal{L}(\theta) = \prod_{i=1}^{n} f(y_i \mid \theta) = \prod_{i=1}^{n} h(y_i \mid \theta)\, S(y_i \mid \theta).$$

b) If we observe a number of people all becoming unemployed at the same point in time, but we only observe whether they exited unemployment before a fixed point in time, say $c$:

$$\mathcal{L}(\theta) = \prod_{i=1}^{n} F(c \mid \theta)^{d_i}\, \big(1 - F(c \mid \theta)\big)^{1-d_i} = \prod_{i=1}^{n} \big(1 - S(c \mid \theta)\big)^{d_i}\, S(c \mid \theta)^{1-d_i},$$

where $d_i = 1$ denotes that individual $i$ left unemployment before $c$, and $d_i = 0$ denotes that this individual was still unemployed at time $c$.

c) If we observe the exact exit (failure) time when it occurs before $c$, but only an indicator when exit occurs after $c$:

$$\mathcal{L}(\theta) = \prod_{i=1}^{n} f(y_i \mid \theta)^{d_i}\, S(c \mid \theta)^{1-d_i} = \prod_{i=1}^{n} h(y_i \mid \theta)^{d_i}\, S(y_i \mid \theta)^{d_i}\, S(c \mid \theta)^{1-d_i}.$$

d) Let $c_i$ denote the specific censoring time of individual $i$, and let $t_i$ denote the minimum of the exit time $y_i$ and the censoring time $c_i$, $t_i = \min(y_i, c_i)$:

$$\mathcal{L}(\theta) = \prod_{i=1}^{n} f(t_i \mid \theta)^{d_i}\, S(t_i \mid \theta)^{1-d_i} = \prod_{i=1}^{n} h(t_i \mid \theta)^{d_i}\, S(t_i \mid \theta).$$
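For the exponential model, case (d) has a closed-form MLE: the log-likelihood is $L(\theta) = \sum_i \big[d_i \ln\theta - \theta t_i\big]$, and setting the score to zero gives $\hat{\theta} = \sum_i d_i / \sum_i t_i$. A minimal numerical sketch follows; the true rate, the sample size, and the uniform censoring-time distribution are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
theta_true = 0.4  # hypothetical acceptance rate
n = 1000
y = rng.exponential(scale=1.0 / theta_true, size=n)  # latent durations
c = rng.uniform(1.0, 6.0, size=n)                    # censoring times c_i
t = np.minimum(y, c)                                 # t_i = min(y_i, c_i)
d = (y <= c).astype(float)                           # d_i = 1 if exit observed

# Log-likelihood of case (d): L(theta) = sum_i [d_i ln(theta) - theta t_i].
# Setting dL/dtheta = sum_i d_i / theta - sum_i t_i = 0 gives the MLE:
theta_hat = d.sum() / t.sum()
```

Note that censored observations ($d_i = 0$) still contribute to the denominator through their observed time at risk $t_i$, which is exactly what the survivor-function term $S(t_i \mid \theta)$ supplies in the likelihood.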
Properties of MLE:

$$\hat{\theta}_{MLE} = \arg\max_{\theta \in \Theta} \sum_{i=1}^{n} \ln f(Z_i, \theta).$$

a. Consistency: for all $\varepsilon > 0$,

$$\lim_{n \to \infty} \Pr\big(\,|\hat{\theta}_{MLE} - \theta_0| > \varepsilon\,\big) = 0.$$

b. Asymptotic normality:

$$\sqrt{n}\,(\hat{\theta}_{MLE} - \theta_0) \xrightarrow{d} N\Big(0,\; \Big[-E\Big(\frac{\partial^2 \ln f(Z_i, \theta_0)}{\partial\theta\, \partial\theta'}\Big)\Big]^{-1}\Big).$$

Computation of the maximum likelihood estimator (Newton-Raphson method):

• Approximate the objective function $Q(\theta) = -L(\theta)$ around some starting value $\theta_0$ by a quadratic function, and find the exact minimum of that quadratic approximation. Call this $\theta_1$.
• Redo the quadratic approximation around $\theta_1$ and iterate,

$$\theta_{j+1} = \theta_j - \Big[\frac{\partial^2 Q(\theta_j)}{\partial\theta\, \partial\theta'}\Big]^{-1} \frac{\partial Q(\theta_j)}{\partial\theta},$$

until the change between successive iterations is negligible.
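The iteration can be illustrated in one dimension by fitting the exponential duration model, where $Q(\theta) = -L(\theta) = -(n\ln\theta - \theta\sum_i t_i)$ and the known MLE $n/\sum_i t_i$ gives a check on convergence. The data, the true rate, and the starting value are hypothetical choices for this sketch.

```python
import numpy as np

rng = np.random.default_rng(3)
t = rng.exponential(scale=2.0, size=500)  # hypothetical durations, theta_0 = 0.5
n, s = t.size, t.sum()

# Q(theta) = -L(theta) = -(n ln(theta) - theta * sum t_i)
grad = lambda th: -(n / th - s)  # dQ/dtheta  (score with flipped sign)
hess = lambda th: n / th**2      # d2Q/dtheta2 (positive, so Newton steps descend)

theta = 0.3  # starting value, chosen near the data scale
for _ in range(50):
    step = grad(theta) / hess(theta)  # theta_{j+1} = theta_j - H^{-1} g
    theta -= step
    if abs(step) < 1e-10:             # stop when successive changes are negligible
        break

# For this model the exact MLE is n / sum t_i, which the iteration reproduces.
```

Near the optimum the quadratic approximation is accurate and the iteration converges quadratically, which is why a handful of steps suffices here; for poorly chosen starting values, Newton-Raphson in general may need step-size safeguards.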