Advanced Econometrics — Part I: Basic Econometric Models
Chapter 1: CLASSICAL LINEAR REGRESSION
Nam T. Hoang, University of New England (Australia) & University of Economics, HCMC (Vietnam)

I. MODEL

Population model:

$$Y = f(X_2, X_3, \dots, X_k) + \varepsilon$$

where $Y$ is the dependent variable, the $X$'s are the explanatory variables (regressors), and $\varepsilon$ is the disturbance (error).

- $f$ may be of any kind (linear, non-linear, parametric, non-parametric, ...).
- We will focus on models that are parametric and linear in the parameters.

Sample information:

- We have a sample $\{Y_i, X_{i2}, X_{i3}, \dots, X_{ik}\}_{i=1}^n$.
- Assume that these observed values are generated by the population model:

$$Y_i = \beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i} + \dots + \beta_k X_{ki} + \varepsilon_i$$

- Objectives:
  i. Estimate the unknown parameters.
  ii. Test hypotheses about the parameters.
  iii. Predict values of $Y$ outside the sample.
- Note that $\beta_k = \partial Y_i / \partial X_{ki}$, so the parameters are the marginal effects of the $X$'s on $Y$, with other factors held constant.

Example: $C_i = \beta_1 + \beta_2 Y_i + \varepsilon_i$, where $\beta_2 = \partial C_i / \partial Y_i$ is the marginal propensity to consume (MPC), so we require $0 \le \beta_2 \le 1$.

Notation: denote

$$Y = \begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix}, \quad
X = \begin{bmatrix} 1 & X_{12} & X_{13} & \cdots & X_{1k} \\ 1 & X_{22} & X_{23} & \cdots & X_{2k} \\ \vdots & & & & \vdots \\ 1 & X_{n2} & X_{n3} & \cdots & X_{nk} \end{bmatrix}, \quad
\beta = \begin{bmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_k \end{bmatrix}, \quad
\varepsilon = \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{bmatrix}$$

so that we have:

$$\underset{(n\times 1)}{Y} = \underset{(n\times k)}{X}\,\underset{(k\times 1)}{\beta} + \underset{(n\times 1)}{\varepsilon}$$

II. ASSUMPTIONS OF THE CLASSICAL REGRESSION MODEL

Models are simplifications of reality. We make a set of simplifying assumptions for the model, relating to:
- Functional form
- Regressors
- Disturbances

Assumption 1: Linearity. The model is linear in the parameters: $Y = X\beta + \varepsilon$.

Assumption 2: Full rank. The $X_{ij}$'s are not random variables, or they are random variables that are uncorrelated with $\varepsilon$. There are no EXACT linear dependencies among the columns of $X$: $\text{Rank}(X) = k$. This assumption is necessary for estimation of the parameters (only exact dependence is ruled out). Note that $\text{Rank}(X) = k$ requires $n \ge k$, since $\text{Rank}(A) \le \min(\text{rows}, \text{columns})$.

Assumption 3: Exogeneity of the independent variables.

$$E[\varepsilon_i \mid X_{j1}, X_{j2}, \dots, X_{jk}] = 0 \quad \text{for } i = j \text{ and also } i \ne j$$

This means that the independent variables carry no useful information for predicting $\varepsilon_i$. In particular, $E[\varepsilon_i \mid X] = 0$ for all $i = 1,\dots,n$, which implies $E(\varepsilon_i) = 0$.

Assumption 4: Homoscedasticity and no autocorrelation.

$$\text{Var}(\varepsilon_i) = \sigma^2 \quad i = 1,\dots,n; \qquad \text{Cov}(\varepsilon_i, \varepsilon_j) = 0 \quad \forall i \ne j$$

For any random vector $Z = (z_1, z_2, \dots, z_m)'$, we can express its variance-covariance matrix as:

$$\text{VarCov}(Z) = E[(Z - E(Z))(Z - E(Z))'] =
\begin{bmatrix} \sigma_1^2 & \sigma_{12} & \cdots & \sigma_{1m} \\ \sigma_{21} & \sigma_2^2 & \cdots & \sigma_{2m} \\ \vdots & & \ddots & \vdots \\ \sigma_{m1} & \sigma_{m2} & \cdots & \sigma_m^2 \end{bmatrix}$$

whose $j$th diagonal element is $\text{var}(z_j) = \sigma_{jj} = \sigma_j^2$ and whose $(i,j)$ element ($i \ne j$) is $\text{cov}(z_i, z_j) = \sigma_{ij}$.

So we have a "covariance matrix" for the vector $\varepsilon$. Since $E(\varepsilon) = 0$:

$$\text{VarCov}(\varepsilon) = E[(\varepsilon - E(\varepsilon))(\varepsilon - E(\varepsilon))'] = E(\varepsilon\varepsilon')$$

Assumption 4 is then equivalent to:

$$E(\varepsilon\varepsilon') = \sigma^2 I = \begin{bmatrix} \sigma^2 & 0 & \cdots & 0 \\ 0 & \sigma^2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma^2 \end{bmatrix}
\;\Leftrightarrow\;
\begin{cases} \text{Var}(\varepsilon_i) = \sigma^2, \; i = 1,\dots,n & \text{(homoscedasticity)} \\ \text{Cov}(\varepsilon_i, \varepsilon_j) = 0, \; \forall i \ne j & \text{(no autocorrelation)} \end{cases}$$

Assumption 5: Data generating process for the regressors (non-stochastic $X$). The $X_{ij}$'s are not random variables. Note: this assumption is different from Assumption 3, which only restricts the conditional mean, $E[\varepsilon_i \mid X] = 0$.

Assumption 6: Normality of errors. $\varepsilon \sim N[0, \sigma^2 I]$. Normality is not necessary to obtain many results in the regression model; it is possible to relax this assumption and retain most of the statistical results.
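As a concrete illustration of these assumptions, here is a minimal simulation sketch (ours, not from the notes; it assumes NumPy is available, and all variable names are hypothetical) that generates a sample satisfying Assumptions 1-6:

```python
import numpy as np

rng = np.random.default_rng(42)
n, k = 100, 3                  # n observations, k parameters (including the intercept)

# Regressors: an intercept column plus two columns drawn once and then treated
# as fixed, mimicking the non-stochastic X of Assumption 5.
X = np.column_stack([np.ones(n), rng.uniform(0, 10, n), rng.uniform(0, 5, n)])
beta = np.array([1.0, 0.5, -2.0])   # true parameters; model is linear in beta (A1)
sigma = 1.5

# Disturbances: iid N(0, sigma^2), so E[eps] = 0 and VarCov(eps) = sigma^2 * I
# (Assumptions 3, 4 and 6).
eps = rng.normal(0.0, sigma, size=n)
Y = X @ beta + eps                  # population model: Y = X beta + eps

print(np.linalg.matrix_rank(X))     # Assumption 2: full rank, prints k = 3
```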
SUMMARY: The classical linear regression model is:

$$Y = X\beta + \varepsilon, \qquad \varepsilon \sim N[0, \sigma^2 I], \qquad \text{Rank}(X) = k, \qquad X \text{ non-stochastic}$$

III. LEAST SQUARES ESTIMATION (Ordinary Least Squares — OLS)

Our first task is to estimate the parameters of the model $Y = X\beta + \varepsilon$ with $\varepsilon \sim N[0, \sigma^2 I]$. Many procedures are possible for doing this; the choice should be based on the "sampling properties" of the resulting estimates. Let us consider one possible estimation strategy: least squares.

Denote by $\hat{\beta}$ the estimator of $\beta$.

True relation: $Y = X\beta + \varepsilon$. Estimated relation: $Y = X\hat{\beta} + e$, where $e = (e_1, e_2, \dots, e_n)'$ is the vector of residuals ($e_i$ is the estimate of $\varepsilon_i$).

For the $i$th observation:

$$Y_i = \underbrace{X_i'\beta + \varepsilon_i}_{\text{unobserved (population)}} = \underbrace{X_i'\hat{\beta} + e_i}_{\text{observed (sample)}}$$

Sum of squared residuals:

$$e'e = \sum_{i=1}^n e_i^2 = (Y - X\hat{\beta})'(Y - X\hat{\beta}) = Y'Y - \hat{\beta}'X'Y - Y'X\hat{\beta} + \hat{\beta}'X'X\hat{\beta} = Y'Y - 2\hat{\beta}'X'Y + \hat{\beta}'X'X\hat{\beta}$$

(using $\hat{\beta}'X'Y = Y'X\hat{\beta}$, a scalar). We need to choose the $\hat{\beta}$ that satisfies:

$$\min_{\hat{\beta}} \; \bigl(Y'Y - 2\hat{\beta}'X'Y + \hat{\beta}'X'X\hat{\beta}\bigr)$$

The necessary condition for a minimum is $\partial(e'e)/\partial\hat{\beta} = 0$.

First, $\hat{\beta}'X'Y = \hat{\beta}_1 \sum Y_i + \hat{\beta}_2 \sum X_{i2}Y_i + \dots + \hat{\beta}_k \sum X_{ik}Y_i$, so taking the derivative with respect to each $\hat{\beta}_j$ and stacking:

$$\frac{\partial(\hat{\beta}'X'Y)}{\partial\hat{\beta}} = \begin{bmatrix} \sum Y_i \\ \sum X_{i2}Y_i \\ \vdots \\ \sum X_{ik}Y_i \end{bmatrix} = X'Y$$

Next, $X'X$ is the symmetric matrix of sums of squares and cross-products:

$$X'X = \begin{bmatrix} n & \sum X_{i2} & \cdots & \sum X_{ik} \\ \sum X_{i2} & \sum X_{i2}^2 & \cdots & \sum X_{i2}X_{ik} \\ \vdots & & \ddots & \vdots \\ \sum X_{ik} & \sum X_{ik}X_{i2} & \cdots & \sum X_{ik}^2 \end{bmatrix}$$

and $\hat{\beta}'X'X\hat{\beta}$ is a quadratic form:

$$\hat{\beta}'X'X\hat{\beta} = \sum_{i=1}^k \sum_{j=1}^k (X'X)_{ij}\,\hat{\beta}_i\hat{\beta}_j$$

Taking derivatives with respect to each $\hat{\beta}_i$: for $j = i$, $\partial[(X'X)_{ii}\hat{\beta}_i^2]/\partial\hat{\beta}_i = 2(X'X)_{ii}\hat{\beta}_i$; for $j \ne i$, the pair of terms $(X'X)_{ij}\hat{\beta}_i\hat{\beta}_j$ and $(X'X)_{ji}\hat{\beta}_j\hat{\beta}_i$ together contributes $2(X'X)_{ij}\hat{\beta}_j$. Stacking these:

$$\frac{\partial[\hat{\beta}'(X'X)\hat{\beta}]}{\partial\hat{\beta}} = 2(X'X)\hat{\beta}$$

So:

$$\frac{\partial(e'e)}{\partial\hat{\beta}} = 0 \;\Leftrightarrow\; -2X'Y + 2(X'X)\hat{\beta} = 0 \quad \text{(the "normal equations")}$$

$$\Rightarrow \; (X'X)\hat{\beta} = X'Y \;\Rightarrow\; \hat{\beta} = (X'X)^{-1}X'Y$$

Note: for the existence of $(X'X)^{-1}$ we need the assumption that $\text{Rank}(X) = k$.

IV. ALGEBRAIC PROPERTIES OF LEAST SQUARES

"Orthogonality condition": the normal equations $-2X'Y + 2(X'X)\hat{\beta} = 0$ can be written

$$X'(\underbrace{Y - X\hat{\beta}}_{e}) = 0 \;\Leftrightarrow\; X'e = 0$$

that is, $\sum_{i=1}^n e_i = 0$ (from the intercept column) and $\sum_{i=1}^n X_{ij}e_i = 0$ for $j = 2, \dots, k$.

Deviation-from-mean form (the fitted regression passes through $(\bar{X}, \bar{Y})$):

$$Y_i = \hat{\beta}_1 + \hat{\beta}_2 X_{2i} + \hat{\beta}_3 X_{3i} + \dots + \hat{\beta}_k X_{ki} + e_i, \quad i = 1,\dots,n$$

Summing over all $n$ observations and dividing by $n$ (recall $\frac{1}{n}\sum e_i = 0$):

$$\bar{Y} = \hat{\beta}_1 + \hat{\beta}_2\bar{X}_2 + \hat{\beta}_3\bar{X}_3 + \dots + \hat{\beta}_k\bar{X}_k$$

Subtracting:

$$Y_i - \bar{Y} = \hat{\beta}_2(X_{2i} - \bar{X}_2) + \hat{\beta}_3(X_{3i} - \bar{X}_3) + \dots + \hat{\beta}_k(X_{ki} - \bar{X}_k) + e_i, \quad i = 1,\dots,n$$

In the model in deviation form, the intercept is put aside and can be recovered afterwards. The mean of the fitted values $\hat{Y}_i$ is equal to the mean of the actual $Y_i$ values in the sample:

$$Y_i = X_i'\hat{\beta} + e_i = \hat{Y}_i + e_i \;\Rightarrow\; \sum_{i=1}^n Y_i = \sum_{i=1}^n \hat{Y}_i + \sum_{i=1}^n e_i = \sum_{i=1}^n \hat{Y}_i \;\Rightarrow\; \bar{Y} = \bar{\hat{Y}}$$

Note: these results use the fact that the regression model includes an intercept term.
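The estimator and the algebraic properties above are easy to verify numerically. A minimal sketch (assuming NumPy; the simulated data and variable names are ours, for illustration only):

```python
import numpy as np

# Simulated sample satisfying the classical assumptions.
rng = np.random.default_rng(42)
n = 100
X = np.column_stack([np.ones(n), rng.uniform(0, 10, n), rng.uniform(0, 5, n)])
Y = X @ np.array([1.0, 0.5, -2.0]) + rng.normal(0.0, 1.5, size=n)

# Solve the normal equations (X'X) beta_hat = X'Y; numerically this is
# preferable to forming (X'X)^{-1} explicitly.
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)

Y_hat = X @ beta_hat    # fitted values
e = Y - Y_hat           # residuals

# Algebraic properties of least squares (zero up to floating-point error):
print(X.T @ e)                  # orthogonality: X'e = 0, hence sum(e) = 0
print(Y.mean(), Y_hat.mean())   # mean of fitted values = mean of actual Y
```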
V. PARTITIONED REGRESSION: THE FRISCH-WAUGH THEOREM

Note first the fundamental idempotent matrix $M$:

$$e = Y - X\hat{\beta} = Y - X(X'X)^{-1}X'Y = \underbrace{[I - X(X'X)^{-1}X']}_{M \;(n \times n)}\,Y$$

Substituting $Y = X\beta + \varepsilon$:

$$e = (X\beta + \varepsilon) - X(X'X)^{-1}X'(X\beta + \varepsilon) = (X\beta + \varepsilon) - X\beta - X(X'X)^{-1}X'\varepsilon = [I - X(X'X)^{-1}X']\,\varepsilon$$

So the residual vector $e$ has two alternative representations:

$$e = MY \qquad \text{and} \qquad e = M\varepsilon$$

$M$ is the "residual maker" in the regression of $Y$ on $X$. $M$ is symmetric and idempotent, that is, $M = M'$ and $MM = M$:

$$M' = [I - X(X'X)^{-1}X']' = I - [X(X'X)^{-1}X']' = I - X(X'X)^{-1}X' = M \qquad \text{(note: } (AB)' = B'A'\text{)}$$

$$MM = [I - X(X'X)^{-1}X'][I - X(X'X)^{-1}X'] = I - 2X(X'X)^{-1}X' + X\underbrace{(X'X)^{-1}X'X}_{I}(X'X)^{-1}X' = I - X(X'X)^{-1}X' = M$$

Also we have:

$$MX = [I - X(X'X)^{-1}X']X = X - X(X'X)^{-1}X'X = X - X = \underset{(n \times k)}{0}$$

Partitioned regression: suppose that our matrix of regressors is partitioned into two blocks:

$$X = [\,\underset{n \times k_1}{X_1} \;\; \underset{n \times k_2}{X_2}\,] \qquad (k_1 + k_2 = k)$$

$$\underset{n\times 1}{Y} = X_1\beta_1 + X_2\beta_2 + \varepsilon, \qquad Y = [X_1 \;\; X_2]\begin{bmatrix} \hat{\beta}_1 \\ \hat{\beta}_2 \end{bmatrix} + e$$

The normal equations $(X'X)\hat{\beta} = X'Y$ become:

$$\begin{bmatrix} X_1'X_1 & X_1'X_2 \\ X_2'X_1 & X_2'X_2 \end{bmatrix}\begin{bmatrix} \hat{\beta}_1 \\ \hat{\beta}_2 \end{bmatrix} = \begin{bmatrix} X_1'Y \\ X_2'Y \end{bmatrix}
\;\Leftrightarrow\;
\begin{cases} (X_1'X_1)\hat{\beta}_1 + (X_1'X_2)\hat{\beta}_2 = X_1'Y & (a) \\ (X_2'X_1)\hat{\beta}_1 + (X_2'X_2)\hat{\beta}_2 = X_2'Y & (b) \end{cases}$$

From (a):

$$\hat{\beta}_1 = (X_1'X_1)^{-1}X_1'(Y - X_2\hat{\beta}_2) \qquad (c)$$

Putting (c) into (b):

$$(X_2'X_1)(X_1'X_1)^{-1}X_1'(Y - X_2\hat{\beta}_2) + (X_2'X_2)\hat{\beta}_2 = X_2'Y$$

$$\Leftrightarrow\; X_2'\underbrace{[I - X_1(X_1'X_1)^{-1}X_1']}_{M_1 \;(n \times n)}X_2\,\hat{\beta}_2 = X_2'[I - X_1(X_1'X_1)^{-1}X_1']\,Y$$

We have:

$$(X_2'M_1X_2)\hat{\beta}_2 = X_2'M_1Y \;\Rightarrow\; \hat{\beta}_2 = (X_2'M_1X_2)^{-1}X_2'M_1Y$$

Because $M_1 = M_1'$ and $M_1M_1 = M_1$:

$$(X_2'M_1'M_1X_2)\hat{\beta}_2 = X_2'M_1'M_1Y \;\Rightarrow\; (X_2^{*\prime}X_2^*)\hat{\beta}_2 = X_2^{*\prime}Y^* \;\Rightarrow\; \hat{\beta}_2 = (X_2^{*\prime}X_2^*)^{-1}X_2^{*\prime}Y^*$$

where $X_2^* = M_1X_2$ (so $X_2^{*\prime} = X_2'M_1'$) and $Y^* = M_1Y$.

Interpretation:
- $Y^* = M_1Y$ is the vector of residuals from the regression of $\underset{n\times 1}{Y}$ on $\underset{n \times k_1}{X_1}$.
- $X_2^* = M_1X_2$ is the matrix of residuals from the regressions of the $X_2$ variables on $\underset{n \times k_1}{X_1}$.

Suppose we regress $Y$ on $X_1$ and keep the residuals, and also regress $X_2$ (each column of $X_2$) on $X_1$ and collect the matrix of residuals:

- Regressing $Y$ on $X_1$, the residuals are:

$$e_1 = Y - \hat{Y} = Y - X_1[(X_1'X_1)^{-1}X_1'Y] = M_1Y = Y^*$$

- Regressing $X_2$ (each column of $X_2$) on $X_1$, the matrix of residuals is:

$$\underset{n \times k_2}{E} = X_2 - \hat{X}_2 = [I - X_1(X_1'X_1)^{-1}X_1']X_2 = M_1X_2 = X_2^*$$

- If we now take the residuals $e_1$ and fit the regression $e_1 = E\tilde{\beta} + u$, then we will have $\tilde{\beta} = \hat{\beta}_2$: we get the same result as if we had just run the whole regression. This result is called the "Frisch-Waugh" theorem.
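A quick numerical check of the theorem (again a sketch of ours, assuming NumPy; the data-generating values are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
X1 = np.column_stack([np.ones(n), rng.normal(size=n)])  # block X1 (with intercept)
X2 = rng.normal(size=(n, 2))                            # block X2
Y = X1 @ np.array([1.0, 2.0]) + X2 @ np.array([0.5, -1.0]) + rng.normal(size=n)

# (i) Coefficients on X2 from the full regression of Y on [X1 X2].
X = np.hstack([X1, X2])
beta_full = np.linalg.solve(X.T @ X, X.T @ Y)

# (ii) Frisch-Waugh: partial X1 out of both Y and X2 with the residual
# maker M1, then regress residuals on residuals.
M1 = np.eye(n) - X1 @ np.linalg.solve(X1.T @ X1, X1.T)
Y_star, X2_star = M1 @ Y, M1 @ X2
beta2_fw = np.linalg.solve(X2_star.T @ X2_star, X2_star.T @ Y_star)

print(beta_full[2:])   # coefficients on X2 from the full regression
print(beta2_fw)        # the same values (up to rounding), as the theorem states
```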
Example: $Y$ = wages, $X_2$ = education (years of schooling), $X_1$ = ability (test scores):

$$Y = X_1\beta_1 + X_2\beta_2 + \varepsilon$$

$\beta_2$ is the effect of one extra year of schooling on wages, controlling for ability. $Y^*$ = residuals from the regression of $Y$ on $X_1$ (the variation in wages after controlling for ability); $X_2^*$ = residuals from the regression of $X_2$ on $X_1$. Then regress $Y^*$ on $X_2^*$ to get $\hat{\beta}_2$: $Y^* = X_2^*\beta_2 + u$.

Example: de-trending and de-seasonalizing data. Let $t = (1, 2, \dots, n)'$ and

$$\underset{n\times 1}{Y} = \underset{n\times 1}{t}\,\beta_1 + \underset{n\times k_2}{X_2}\,\underset{k_2\times 1}{\beta_2} + \varepsilon$$

Either include $t$ in the model, or "de-trend" $X_2$ and $Y$ by regressing each of them on $t$ and taking the residuals. Note: including a trend in the regression is an effective way of de-trending the data.

VI. GOODNESS OF FIT

One way of measuring the "quality of the fitted regression line" is to measure the extent to which the sample variability of the $Y$ variable is explained by the model.

The sample variability of $Y$ is $\frac{1}{n}\sum_{i=1}^n (Y_i - \bar{Y})^2$, or we could just use $\sum_{i=1}^n (Y_i - \bar{Y})^2$.

Our fitted regression is $Y = X\hat{\beta} + e = \hat{Y} + e$, with $\hat{Y} = X\hat{\beta} = X(X'X)^{-1}X'Y$. Note that if the model includes an intercept, then $\bar{Y} = \bar{\hat{Y}}$.

Now consider the following matrix:

$$\underset{n\times n}{M^0} = I - \frac{1}{n}\iota\iota' \qquad \text{where } \iota = \underset{n\times 1}{(1, 1, \dots, 1)'}$$

Applying $M^0$ to $Y$ subtracts the sample mean from every element:

$$M^0Y = \begin{bmatrix} Y_1 - \bar{Y} \\ Y_2 - \bar{Y} \\ \vdots \\ Y_n - \bar{Y} \end{bmatrix}$$

We have:
- $M^0$ is symmetric and idempotent;
- $M^0\iota = 0$;
- $(M^0Y)'(M^0Y) = Y'M^0{}'M^0Y = Y'M^0Y = \sum_{i=1}^n (Y_i - \bar{Y})^2$.

So, from $Y = X\hat{\beta} + e = \hat{Y} + e$:

$$M^0Y = M^0X\hat{\beta} + M^0e = M^0\hat{Y} + M^0e$$

Recall that $\sum e_i = 0$ implies $M^0e = e$, and $X'e = 0$ then implies $\hat{\beta}'X'M^0e = \hat{\beta}'X'e = 0$ and $e'M^0X\hat{\beta} = e'X\hat{\beta} = 0$. Therefore:

$$Y'M^0Y = (X\hat{\beta} + e)'(M^0X\hat{\beta} + M^0e) = \hat{\beta}'X'M^0X\hat{\beta} + \hat{\beta}'X'M^0e + e'M^0X\hat{\beta} + e'M^0e = \hat{\beta}'X'M^0X\hat{\beta} + e'e$$

So:

$$\underbrace{\sum_{i=1}^n (Y_i - \bar{Y})^2}_{SST} = \underbrace{\sum_{i=1}^n (\hat{Y}_i - \bar{Y})^2}_{SSR} + \underbrace{\sum_{i=1}^n e_i^2}_{SSE}$$

(since $\bar{\hat{Y}} = \bar{Y}$ and $\hat{Y} = X\hat{\beta}$, we have $\hat{\beta}'X'M^0X\hat{\beta} = \hat{Y}'M^0\hat{Y} = \sum_{i=1}^n (\hat{Y}_i - \bar{Y})^2$).

SST: total sum of squares. SSR: regression sum of squares. SSE: error sum of squares.

Coefficient of determination:

$$R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST} \qquad \text{(only if an intercept is included in the model)}$$

Note: $R^2 = SSR/SST \ge 0$ and $R^2 = 1 - SSE/SST \le 1$, so $0 \le R^2 \le 1$.
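The decomposition and the two equivalent expressions for $R^2$ can be checked numerically. A minimal sketch (assuming NumPy; simulated data of ours):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100
X = np.column_stack([np.ones(n), rng.uniform(0, 10, n)])  # intercept included
Y = X @ np.array([2.0, 0.8]) + rng.normal(0.0, 1.0, size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
Y_hat = X @ beta_hat
e = Y - Y_hat

SST = np.sum((Y - Y.mean()) ** 2)      # total sum of squares
SSR = np.sum((Y_hat - Y.mean()) ** 2)  # regression sum of squares
SSE = np.sum(e ** 2)                   # error sum of squares

print(SST, SSR + SSE)              # SST = SSR + SSE (holds since intercept present)
print(SSR / SST, 1 - SSE / SST)    # two equivalent expressions for R^2
```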
What happens if we add any regressor(s) to the model? Compare:

$$Y = X_1\beta_1 + \varepsilon \qquad (1)$$
$$Y = X_1\beta_1 + X_2\beta_2 + u = X\beta + u \qquad (2)$$

(A) Applying OLS to (2) gives the minimized sum of squared residuals $\hat{u}'\hat{u}$ (minimizing over $\hat{\beta}_1, \hat{\beta}_2$).
(B) Applying OLS to (1) gives the minimized sum of squared residuals $e'e$ (minimizing over $\hat{\beta}_1$).

Problem (B) is just problem (A) subject to the restriction $\beta_2 = 0$. The minimized value in (A) must therefore be $\le$ that in (B), so $\hat{u}'\hat{u} \le e'e$.

→ Adding any regressor(s) to the model cannot increase (and will typically decrease) the sum of squared residuals, so $R^2$ must increase (or at worst stay the same). Hence $R^2$ is not really a very interesting measure of the quality of a regression. For this reason, we often use the "adjusted" $R^2$, adjusted for degrees of freedom:

$$R^2 = 1 - \frac{e'e}{Y'M^0Y} \qquad\qquad \bar{R}^2 = 1 - \frac{e'e/(n-k)}{Y'M^0Y/(n-1)}$$

Note: $e'e = Y'MY$ with $\text{rank}(M) = n - k$ degrees of freedom, while $Y'M^0Y = \sum_{i=1}^n (Y_i - \bar{Y})^2$ has $n - 1$ degrees of freedom.

$\bar{R}^2$ may rise or fall when variables are added; it may even be negative.

Note: if the model does not include an intercept, then the identity $SST = SSR + SSE$ does not hold, and we no longer have $0 \le R^2 \le 1$.

We must also be careful in comparing $R^2$ across different models. For example:

$$\hat{C}_i = 0.5 + 0.8\,Y_i \qquad R^2 = 0.85 \qquad (1)$$
$$\widehat{\log C_i} = 0.2 + 0.7\log Y_i \qquad R^2 = 0.7 \qquad (2)$$

In (1), $R^2$ relates to the sample variation of the variable $C$; in (2), $R^2$ relates to the sample variation of the variable $\log(C)$, so the two are not directly comparable.

Reading at home: Greene, chapters 3 & 4.
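To close the chapter, a small sketch (ours, assuming NumPy; all names hypothetical) illustrating the point about $\bar{R}^2$: adding a pure-noise regressor can never lower $R^2$, but it typically lowers $\bar{R}^2$.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100
x = rng.uniform(0, 10, n)
Y = 2.0 + 0.8 * x + rng.normal(0.0, 1.0, size=n)

def r2_and_adjusted(X, Y):
    """R^2 and adjusted R^2 for an OLS fit of Y on X (X must include an intercept)."""
    n, k = X.shape
    e = Y - X @ np.linalg.solve(X.T @ X, X.T @ Y)
    sst = np.sum((Y - Y.mean()) ** 2)
    r2 = 1 - (e @ e) / sst
    r2_adj = 1 - ((e @ e) / (n - k)) / (sst / (n - 1))
    return r2, r2_adj

X_small = np.column_stack([np.ones(n), x])
X_big = np.column_stack([X_small, rng.normal(size=n)])  # add a pure-noise regressor

print(r2_and_adjusted(X_small, Y))
print(r2_and_adjusted(X_big, Y))   # R^2 never falls; adjusted R^2 typically does
```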