Stock/Watson - Introduction to Econometrics - Second Edition

PART ONE
Solutions to Exercises

Chapter 2
Review of Probability

Solutions to Exercises

1. (a) Probability distribution function for Y

Outcome (number of heads)    Y = 0    Y = 1    Y = 2
Probability                  0.25     0.50     0.25

(b) Cumulative probability distribution function for Y

[...]

\[
\Pr(Y > 2000) = 1 - \Pr(Y \le 2000)
= 1 - \Pr\left( \frac{Y - 1000}{\sqrt{1.9 \times 10^{5}}} \le \frac{2{,}000 - 1{,}000}{\sqrt{1.9 \times 10^{5}}} \right)
\approx 1 - \Phi(2.2942) = 1 - 0.9891 = 0.0109 .
\]

[...] > −1.96 and [...]

Solutions to Exercises in Chapter 18

2. [...] the $R^2$ of the regression is

\[
R^2 = 1 - \frac{SSR}{TSS} = 1 - \frac{3.37}{4.94} = 0.3178 .
\]

(b) When all six assumptions in Key Concept 16.1 hold, we can use the homoskedasticity-only estimator $\widetilde{\Sigma}_{\hat\beta}$ of the covariance matrix of $\hat\beta$, conditional on X, which is

\[
\widetilde{\Sigma}_{\hat\beta} = (X'X)^{-1} s_{\hat u}^2
= \begin{pmatrix} 3.5373 & -0.4631 & -0.0337 \\ -0.4631 & 0.0684 & -0.0080 \\ -0.0337 & -0.0080 & 0.0229 \end{pmatrix} \times 0.1982
= \begin{pmatrix} 0.7011 & -0.09179 & -0.0067 \\ -0.09179 & 0.0136 & -0.0016 \\ -0.0067 & -0.0016 & 0.0045 \end{pmatrix} .
\]

The homoskedasticity-only standard error of $\hat\beta_1$ is

\[
\widetilde{SE}(\hat\beta_1) = 0.0136^{1/2} = 0.1166 .
\]

The t-statistic testing the hypothesis $\beta_1 = 0$ has a $t_{n-k-1} = t_{17}$ distribution under the null hypothesis. The value of the t-statistic is

\[
\tilde t = \frac{\hat\beta_1}{\widetilde{SE}(\hat\beta_1)} = \frac{0.2520}{0.1166} = 2.1612 ,
\]

and the 5% two-sided critical value is 2.11. Thus we can reject the null hypothesis $\beta_1 = 0$ at the 5% significance level.

3. (a)

\[
\begin{aligned}
\operatorname{var}(Q) &= E[(Q - \mu_Q)^2] = E[(Q - \mu_Q)(Q - \mu_Q)'] \\
&= E[(c'W - c'\mu_W)(c'W - c'\mu_W)'] \\
&= c' E[(W - \mu_W)(W - \mu_W)'] c = c' \operatorname{var}(W) c = c' \Sigma_W c ,
\end{aligned}
\]

where the second equality uses the fact that Q is a scalar and the third equality uses the fact that $\mu_Q = c'\mu_W$.

(b) Because the covariance matrix $\Sigma_W$ is positive definite, the definition gives $c'\Sigma_W c > 0$ for every non-zero vector c. Thus $\operatorname{var}(Q) > 0$. Both the vector c and the matrix $\Sigma_W$ are finite, so $\operatorname{var}(Q) = c'\Sigma_W c$ is also finite. Thus $0 < \operatorname{var}(Q) < \infty$.

4. (a) The regression in matrix form is

\[
Y = X\beta + U
\]

with

\[
Y = \begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{pmatrix}, \quad
X = \begin{pmatrix} 1 & X_1 \\ 1 & X_2 \\ \vdots & \vdots \\ 1 & X_n \end{pmatrix}, \quad
U = \begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{pmatrix}, \quad
\beta = \begin{pmatrix} \beta_0 \\ \beta_1 \end{pmatrix} .
\]

(b) Because $X_i' = (1 \;\; X_i)$, assumptions 1–3 in Key Concept 18.1 follow directly from assumptions 1–3 in Key Concept 4.3. Assumption 4 in Key Concept 18.1 is satisfied since the observations $X_i$ (i = 1, 2, …, n) are not constant, so there is no perfect multicollinearity between the two columns of the matrix X.

(c) Matrix multiplication of X′X and X′Y yields

\[
X'X = \begin{pmatrix} n & \sum_{i=1}^n X_i \\ \sum_{i=1}^n X_i & \sum_{i=1}^n X_i^2 \end{pmatrix}, \qquad
X'Y = \begin{pmatrix} \sum_{i=1}^n Y_i \\ \sum_{i=1}^n X_i Y_i \end{pmatrix}
= \begin{pmatrix} n\bar Y \\ \sum_{i=1}^n X_i Y_i \end{pmatrix} .
\]

The inverse of X′X is

\[
(X'X)^{-1}
= \frac{1}{n\sum_{i=1}^n X_i^2 - \left(\sum_{i=1}^n X_i\right)^2}
\begin{pmatrix} \sum_{i=1}^n X_i^2 & -\sum_{i=1}^n X_i \\ -\sum_{i=1}^n X_i & n \end{pmatrix}
= \frac{1}{\sum_{i=1}^n (X_i - \bar X)^2}
\begin{pmatrix} \sum_{i=1}^n X_i^2 / n & -\bar X \\ -\bar X & 1 \end{pmatrix} .
\]

The estimator for the coefficient vector is

\[
\hat\beta = (X'X)^{-1} X'Y
= \frac{1}{\sum_{i=1}^n (X_i - \bar X)^2}
\begin{pmatrix} \sum_{i=1}^n X_i^2 / n & -\bar X \\ -\bar X & 1 \end{pmatrix}
\begin{pmatrix} n\bar Y \\ \sum_{i=1}^n X_i Y_i \end{pmatrix}
= \frac{1}{\sum_{i=1}^n (X_i - \bar X)^2}
\begin{pmatrix} \bar Y \sum_{i=1}^n X_i^2 - \bar X \sum_{i=1}^n X_i Y_i \\ \sum_{i=1}^n X_i Y_i - n\bar X \bar Y \end{pmatrix} .
\]

Therefore we have

\[
\hat\beta_1 = \frac{\sum_{i=1}^n X_i Y_i - n\bar X \bar Y}{\sum_{i=1}^n (X_i - \bar X)^2}
= \frac{\sum_{i=1}^n (X_i - \bar X)(Y_i - \bar Y)}{\sum_{i=1}^n (X_i - \bar X)^2} ,
\]

and

\[
\begin{aligned}
\hat\beta_0 &= \frac{\bar Y \sum_{i=1}^n X_i^2 - \bar X \sum_{i=1}^n X_i Y_i}{\sum_{i=1}^n (X_i - \bar X)^2}
= \frac{\bar Y \sum_{i=1}^n (X_i - \bar X + \bar X)^2 - \bar X \sum_{i=1}^n X_i Y_i}{\sum_{i=1}^n (X_i - \bar X)^2} \\
&= \frac{\bar Y \sum_{i=1}^n (X_i - \bar X)^2 + n\bar X^2 \bar Y - \bar X \sum_{i=1}^n X_i Y_i}{\sum_{i=1}^n (X_i - \bar X)^2}
= \bar Y - \left[ \frac{\sum_{i=1}^n X_i Y_i - n\bar X \bar Y}{\sum_{i=1}^n (X_i - \bar X)^2} \right] \bar X
= \bar Y - \hat\beta_1 \bar X .
\end{aligned}
\]

We get the same expressions for $\hat\beta_0$ and $\hat\beta_1$ as given in Key Concept 4.2.

(d) The large-sample covariance matrix of $\hat\beta$, conditional on X, is

\[
\Sigma_{\hat\beta} = \frac{1}{n} Q_X^{-1} \Sigma_V Q_X^{-1}
\]

with $Q_X = E(X_i X_i')$ and $\Sigma_V = E(V_i V_i') = E(X_i u_i u_i' X_i')$. The column vector $X_i$ for the ith observation is

\[
X_i = \begin{pmatrix} 1 \\ X_i \end{pmatrix},
\]

so we have

\[
X_i X_i' = \begin{pmatrix} 1 \\ X_i \end{pmatrix} (1 \;\; X_i) = \begin{pmatrix} 1 & X_i \\ X_i & X_i^2 \end{pmatrix}, \qquad
V_i = X_i u_i = \begin{pmatrix} u_i \\ X_i u_i \end{pmatrix},
\]

and

\[
V_i V_i' = \begin{pmatrix} u_i \\ X_i u_i \end{pmatrix} (u_i \;\; X_i u_i)
= \begin{pmatrix} u_i^2 & X_i u_i^2 \\ X_i u_i^2 & X_i^2 u_i^2 \end{pmatrix} .
\]

Taking expectations, we get

\[
Q_X = E(X_i X_i') = \begin{pmatrix} 1 & \mu_X \\ \mu_X & E(X_i^2) \end{pmatrix},
\]

and

\[
\Sigma_V = E(V_i V_i')
= \begin{pmatrix} E(u_i^2) & E(X_i u_i^2) \\ E(X_i u_i^2) & E(X_i^2 u_i^2) \end{pmatrix}
= \begin{pmatrix} \operatorname{var}(u_i) & \operatorname{cov}(X_i u_i, u_i) \\ \operatorname{cov}(X_i u_i, u_i) & \operatorname{var}(X_i u_i) \end{pmatrix} .
\]

In the above equation, the third equality has used the fact that $E(u_i \mid X_i) = 0$, so

\[
\begin{aligned}
E(u_i) &= E[E(u_i \mid X_i)] = 0, \\
E(X_i u_i) &= E[X_i E(u_i \mid X_i)] = 0, \\
E(u_i^2) &= \operatorname{var}(u_i) + [E(u_i)]^2 = \operatorname{var}(u_i), \\
E(X_i^2 u_i^2) &= \operatorname{var}(X_i u_i) + [E(X_i u_i)]^2 = \operatorname{var}(X_i u_i), \\
E(X_i u_i^2) &= \operatorname{cov}(X_i u_i, u_i) + E(X_i u_i) E(u_i) = \operatorname{cov}(X_i u_i, u_i) .
\end{aligned}
\]

The inverse of $Q_X$ is

\[
Q_X^{-1} = \begin{pmatrix} 1 & \mu_X \\ \mu_X & E(X_i^2) \end{pmatrix}^{-1}
= \frac{1}{E(X_i^2) - \mu_X^2} \begin{pmatrix} E(X_i^2) & -\mu_X \\ -\mu_X & 1 \end{pmatrix} .
\]

We can now calculate the large-sample covariance matrix of $\hat\beta$, conditional on X, from

\[
\Sigma_{\hat\beta} = \frac{1}{n} Q_X^{-1} \Sigma_V Q_X^{-1}
= \frac{1}{n[E(X_i^2) - \mu_X^2]^2}
\begin{pmatrix} E(X_i^2) & -\mu_X \\ -\mu_X & 1 \end{pmatrix}
\begin{pmatrix} \operatorname{var}(u_i) & \operatorname{cov}(X_i u_i, u_i) \\ \operatorname{cov}(X_i u_i, u_i) & \operatorname{var}(X_i u_i) \end{pmatrix}
\begin{pmatrix} E(X_i^2) & -\mu_X \\ -\mu_X & 1 \end{pmatrix} .
\]

The (1, 1) element of $\Sigma_{\hat\beta}$ is

\[
\begin{aligned}
&\frac{1}{n[E(X_i^2) - \mu_X^2]^2} \left\{ [E(X_i^2)]^2 \operatorname{var}(u_i) - 2E(X_i^2)\mu_X \operatorname{cov}(X_i u_i, u_i) + \mu_X^2 \operatorname{var}(X_i u_i) \right\} \\
&\quad = \frac{1}{n[E(X_i^2) - \mu_X^2]^2} \operatorname{var}\left[ E(X_i^2) u_i - \mu_X X_i u_i \right] \\
&\quad = \frac{[E(X_i^2)]^2}{n[E(X_i^2) - \mu_X^2]^2} \operatorname{var}\left[ u_i - \frac{\mu_X}{E(X_i^2)} X_i u_i \right] \\
&\quad = \frac{1}{n\left[ 1 - \frac{\mu_X^2}{E(X_i^2)} \right]^2} \operatorname{var}\left[ \left( 1 - \frac{\mu_X}{E(X_i^2)} X_i \right) u_i \right] \\
&\quad = \frac{\operatorname{var}(H_i u_i)}{n[E(H_i^2)]^2}
\end{aligned}
\]

(the same as the expression for $\sigma^2_{\hat\beta_0}$ given in Key Concept 4.4), by defining

\[
H_i = 1 - \frac{\mu_X}{E(X_i^2)} X_i .
\]

The denominator in the last equality for the (1, 1) element of $\Sigma_{\hat\beta}$ has used the fact that

\[
H_i^2 = \left( 1 - \frac{\mu_X}{E(X_i^2)} X_i \right)^2
= 1 + \frac{\mu_X^2}{[E(X_i^2)]^2} X_i^2 - \frac{2\mu_X}{E(X_i^2)} X_i ,
\]

so

\[
E(H_i^2) = 1 + \frac{\mu_X^2}{[E(X_i^2)]^2} E(X_i^2) - \frac{2\mu_X}{E(X_i^2)} \mu_X = 1 - \frac{\mu_X^2}{E(X_i^2)} .
\]

5. $P_X = X(X'X)^{-1}X'$, $M_X = I_n - P_X$.

(a) $P_X$ is idempotent because

\[
P_X P_X = X(X'X)^{-1}X'X(X'X)^{-1}X' = X(X'X)^{-1}X' = P_X .
\]

$M_X$ is idempotent because

\[
M_X M_X = (I_n - P_X)(I_n - P_X) = I_n - P_X - P_X + P_X P_X = I_n - 2P_X + P_X = I_n - P_X = M_X .
\]

$P_X M_X = 0_{n \times n}$ because

\[
P_X M_X = P_X (I_n - P_X) = P_X - P_X P_X = P_X - P_X = 0_{n \times n} .
\]

(b) Because $\hat\beta = (X'X)^{-1}X'Y$, we have

\[
\hat Y = X\hat\beta = X(X'X)^{-1}X'Y = P_X Y ,
\]

which is Equation (18.27). The residual vector is

\[
\hat U = Y - \hat Y = Y - P_X Y = (I_n - P_X) Y = M_X Y .
\]

We know that $M_X$ is orthogonal to the columns of X:

\[
M_X X = (I_n - P_X) X = X - P_X X = X - X(X'X)^{-1}X'X = X - X = 0 ,
\]

so the residual vector can be further written as

\[
\hat U = M_X Y = M_X (X\beta + U) = M_X X\beta + M_X U = M_X U ,
\]

which is Equation (18.28).

6. The matrix form for Equation (10.14) is

\[
\tilde Y = \tilde X \tilde\beta + \tilde U
\]

with

\[
\tilde Y = \begin{pmatrix} Y_{11} - \bar Y_1 \\ \vdots \\ Y_{1T} - \bar Y_1 \\ Y_{21} - \bar Y_2 \\ \vdots \\ Y_{2T} - \bar Y_2 \\ \vdots \\ Y_{n1} - \bar Y_n \\ \vdots \\ Y_{nT} - \bar Y_n \end{pmatrix}, \quad
\tilde X = \begin{pmatrix} X_{11} - \bar X_1 \\ \vdots \\ X_{1T} - \bar X_1 \\ X_{21} - \bar X_2 \\ \vdots \\ X_{2T} - \bar X_2 \\ \vdots \\ X_{n1} - \bar X_n \\ \vdots \\ X_{nT} - \bar X_n \end{pmatrix}, \quad
\tilde U = \begin{pmatrix} u_{11} - \bar u_1 \\ \vdots \\ u_{1T} - \bar u_1 \\ u_{21} - \bar u_2 \\ \vdots \\ u_{2T} - \bar u_2 \\ \vdots \\ u_{n1} - \bar u_n \\ \vdots \\ u_{nT} - \bar u_n \end{pmatrix}, \quad
\tilde\beta = \beta_1 .
\]

The OLS "de-meaning" fixed effects estimator is

\[
\hat\beta_1^{DM} = (\tilde X'\tilde X)^{-1} \tilde X'\tilde Y .
\]

Rewrite Equation (10.11) using n fixed effects as

\[
Y_{it} = X_{it}\beta_1 + D1_i \gamma_1 + D2_i \gamma_2 + \cdots + Dn_i \gamma_n + u_{it} .
\]

In matrix form this is

\[
Y_{nT \times 1} = X_{nT \times 1}\, \beta_{1 \times 1} + W_{nT \times n}\, \gamma_{n \times 1} + U_{nT \times 1} ,
\]

with the subscripts denoting the size of the matrices. The matrices for variables and coefficients are

\[
Y = \begin{pmatrix} Y_{11} \\ \vdots \\ Y_{1T} \\ Y_{21} \\ \vdots \\ Y_{2T} \\ \vdots \\ Y_{n1} \\ \vdots \\ Y_{nT} \end{pmatrix}, \quad
X = \begin{pmatrix} X_{11} \\ \vdots \\ X_{1T} \\ X_{21} \\ \vdots \\ X_{2T} \\ \vdots \\ X_{n1} \\ \vdots \\ X_{nT} \end{pmatrix}, \quad
W = \begin{pmatrix}
D1_1 & D2_1 & \cdots & Dn_1 \\
\vdots & \vdots & & \vdots \\
D1_1 & D2_1 & \cdots & Dn_1 \\
D1_2 & D2_2 & \cdots & Dn_2 \\
\vdots & \vdots & & \vdots \\
D1_2 & D2_2 & \cdots & Dn_2 \\
\vdots & \vdots & & \vdots \\
D1_n & D2_n & \cdots & Dn_n \\
\vdots & \vdots & & \vdots \\
D1_n & D2_n & \cdots & Dn_n
\end{pmatrix}, \quad
U = \begin{pmatrix} u_{11} \\ \vdots \\ u_{1T} \\ u_{21} \\ \vdots \\ u_{2T} \\ \vdots \\ u_{n1} \\ \vdots \\ u_{nT} \end{pmatrix},
\]

\[
\beta = \beta_1 , \qquad \gamma = \begin{pmatrix} \gamma_1 \\ \gamma_2 \\ \vdots \\ \gamma_n \end{pmatrix} .
\]

Using the expression for $\hat\beta$ given in the question, we have the estimator

\[
\hat\beta_1^{BV} = \hat\beta = (X'M_W X)^{-1} X'M_W Y = \left( (M_W X)'(M_W X) \right)^{-1} (M_W X)'(M_W Y) ,
\]

where the second equality uses the fact that $M_W$ is symmetric and idempotent. Using the definition of W, $P_W X$ stacks the entity means of X, with each $\bar X_i$ repeated T times:

\[
P_W X = \begin{pmatrix} \bar X_1 \\ \vdots \\ \bar X_1 \\ \bar X_2 \\ \vdots \\ \bar X_2 \\ \vdots \\ \bar X_n \\ \vdots \\ \bar X_n \end{pmatrix}
\quad \text{and} \quad
M_W X = X - P_W X = \begin{pmatrix} X_{11} - \bar X_1 \\ \vdots \\ X_{1T} - \bar X_1 \\ X_{21} - \bar X_2 \\ \vdots \\ X_{2T} - \bar X_2 \\ \vdots \\ X_{n1} - \bar X_n \\ \vdots \\ X_{nT} - \bar X_n \end{pmatrix},
\]

so that $M_W X = \tilde X$. A similar calculation shows $M_W Y = \tilde Y$. Thus

\[
\hat\beta_1^{BV} = (\tilde X'\tilde X)^{-1} \tilde X'\tilde Y = \hat\beta_1^{DM} .
\]

7. (a) We write the regression model, $Y_i = \beta_1 X_i + \beta_2 W_i + u_i$, in matrix form as

\[
Y = X\beta + W\gamma + U
\]

with

\[
Y = \begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{pmatrix}, \quad
X = \begin{pmatrix} X_1 \\ X_2 \\ \vdots \\ X_n \end{pmatrix}, \quad
W = \begin{pmatrix} W_1 \\ W_2 \\ \vdots \\ W_n \end{pmatrix}, \quad
U = \begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{pmatrix}, \quad
\beta = \beta_1 , \quad \gamma = \beta_2 .
\]

The OLS estimator is

\[
\begin{aligned}
\begin{pmatrix} \hat\beta_1 \\ \hat\beta_2 \end{pmatrix}
&= \begin{pmatrix} X'X & X'W \\ W'X & W'W \end{pmatrix}^{-1} \begin{pmatrix} X'Y \\ W'Y \end{pmatrix}
= \begin{pmatrix} \beta_1 \\ \beta_2 \end{pmatrix} + \begin{pmatrix} X'X & X'W \\ W'X & W'W \end{pmatrix}^{-1} \begin{pmatrix} X'U \\ W'U \end{pmatrix} \\
&= \begin{pmatrix} \beta_1 \\ \beta_2 \end{pmatrix} + \begin{pmatrix} \frac{1}{n} X'X & \frac{1}{n} X'W \\ \frac{1}{n} W'X & \frac{1}{n} W'W \end{pmatrix}^{-1} \begin{pmatrix} \frac{1}{n} X'U \\ \frac{1}{n} W'U \end{pmatrix} \\
&= \begin{pmatrix} \beta_1 \\ \beta_2 \end{pmatrix} + \begin{pmatrix} \frac{1}{n} \sum_{i=1}^n X_i^2 & \frac{1}{n} \sum_{i=1}^n X_i W_i \\ \frac{1}{n} \sum_{i=1}^n W_i X_i & \frac{1}{n} \sum_{i=1}^n W_i^2 \end{pmatrix}^{-1} \begin{pmatrix} \frac{1}{n} \sum_{i=1}^n X_i u_i \\ \frac{1}{n} \sum_{i=1}^n W_i u_i \end{pmatrix} .
\end{aligned}
\]

By the law of large numbers,
$\frac{1}{n}\sum_{i=1}^n X_i^2 \xrightarrow{p} E(X^2)$;
$\frac{1}{n}\sum_{i=1}^n W_i^2 \xrightarrow{p} E(W^2)$;
$\frac{1}{n}\sum_{i=1}^n X_i W_i \xrightarrow{p} E(XW) = 0$ (because X and W are independent with means of zero);
$\frac{1}{n}\sum_{i=1}^n X_i u_i \xrightarrow{p} E(Xu) = 0$ (because X and u are independent with means of zero);
$\frac{1}{n}\sum_{i=1}^n W_i u_i \xrightarrow{p} E(Wu)$.
Thus

\[
\begin{pmatrix} \hat\beta_1 \\ \hat\beta_2 \end{pmatrix}
\xrightarrow{p} \begin{pmatrix} \beta_1 \\ \beta_2 \end{pmatrix} + \begin{pmatrix} E(X^2) & 0 \\ 0 & E(W^2) \end{pmatrix}^{-1} \begin{pmatrix} 0 \\ E(Wu) \end{pmatrix}
= \begin{pmatrix} \beta_1 \\ \beta_2 + \frac{E(Wu)}{E(W^2)} \end{pmatrix} .
\]

(b) From the answer to (a), $\hat\beta_2 \xrightarrow{p} \beta_2 + \frac{E(Wu)}{E(W^2)} \ne \beta_2$ if E(Wu) is nonzero.

(c) Consider the population linear regression of $u_i$ onto $W_i$:

\[
u_i = \lambda W_i + a_i ,
\]

where $\lambda = E(Wu)/E(W^2)$. In this population regression, by construction, E(aW) = 0. Using this equation for $u_i$, rewrite the equation to be estimated as

\[
Y_i = X_i\beta_1 + W_i\beta_2 + u_i = X_i\beta_1 + W_i(\beta_2 + \lambda) + a_i = X_i\beta_1 + W_i\theta + a_i ,
\]

where $\theta = \beta_2 + \lambda$. A calculation like that used in part (a) can be used to show that

\[
\begin{pmatrix} \sqrt{n}(\hat\beta_1 - \beta_1) \\ \sqrt{n}(\hat\beta_2 - \theta) \end{pmatrix}
= \begin{pmatrix} \frac{1}{n} \sum_{i=1}^n X_i^2 & \frac{1}{n} \sum_{i=1}^n X_i W_i \\ \frac{1}{n} \sum_{i=1}^n W_i X_i & \frac{1}{n} \sum_{i=1}^n W_i^2 \end{pmatrix}^{-1}
\begin{pmatrix} \frac{1}{\sqrt{n}} \sum_{i=1}^n X_i a_i \\ \frac{1}{\sqrt{n}} \sum_{i=1}^n W_i a_i \end{pmatrix}
\xrightarrow{d} \begin{pmatrix} E(X^2) & 0 \\ 0 & E(W^2) \end{pmatrix}^{-1} \begin{pmatrix} S_1 \\ S_2 \end{pmatrix} ,
\]

where $S_1$ is distributed $N(0, \sigma_a^2 E(X^2))$. Thus by Slutsky's theorem

\[
\sqrt{n}(\hat\beta_1 - \beta_1) \xrightarrow{d} N\left( 0, \frac{\sigma_a^2}{E(X^2)} \right) .
\]

Now consider the regression that omits W, which can be written as

\[
Y_i = X_i\beta_1 + d_i ,
\]

where $d_i = W_i\theta + a_i$. Calculations like those used above imply that

\[
\sqrt{n}\left( \hat\beta_1^{r} - \beta_1 \right) \xrightarrow{d} N\left( 0, \frac{\sigma_d^2}{E(X^2)} \right) .
\]

Since $\sigma_d^2 = \sigma_a^2 + \theta^2 E(W^2)$, the asymptotic variance of $\hat\beta_1^{r}$ is never smaller than the asymptotic variance of $\hat\beta_1$.

8. (a) The regression errors satisfy $u_1 = \tilde u_1$ and $u_i = 0.5u_{i-1} + \tilde u_i$ for i = 2, 3, …, n, with the random variables $\tilde u_i$ (i = 1, 2, …, n) being i.i.d. with mean 0 and variance 1. For i > 1, repeatedly substituting $u_{i-j} = 0.5u_{i-j-1} + \tilde u_{i-j}$ (j = 1, 2, …, i − 2) and $u_1 = \tilde u_1$ into the expression $u_i = 0.5u_{i-1} + \tilde u_i$ yields

\[
\begin{aligned}
u_i &= 0.5u_{i-1} + \tilde u_i \\
&= 0.5(0.5u_{i-2} + \tilde u_{i-1}) + \tilde u_i \\
&= 0.5^2(0.5u_{i-3} + \tilde u_{i-2}) + 0.5\tilde u_{i-1} + \tilde u_i \\
&= 0.5^3(0.5u_{i-4} + \tilde u_{i-3}) + 0.5^2\tilde u_{i-2} + 0.5\tilde u_{i-1} + \tilde u_i \\
&= \cdots \\
&= 0.5^{i-1}\tilde u_1 + 0.5^{i-2}\tilde u_2 + 0.5^{i-3}\tilde u_3 + \cdots + 0.5^2\tilde u_{i-2} + 0.5\tilde u_{i-1} + \tilde u_i \\
&= \sum_{j=1}^{i} 0.5^{i-j}\tilde u_j .
\end{aligned}
\]

Though we derived the expression $u_i = \sum_{j=1}^{i} 0.5^{i-j}\tilde u_j$ for i > 1, it is apparent that it also holds for i = 1. Thus we can get the mean and variance of the random variables $u_i$ (i = 1, 2, …, n):

\[
E(u_i) = \sum_{j=1}^{i} 0.5^{i-j} E(\tilde u_j) = 0 ,
\]

\[
\sigma_i^2 = \operatorname{var}(u_i) = \sum_{j=1}^{i} (0.5^{i-j})^2 \operatorname{var}(\tilde u_j) = \sum_{j=1}^{i} (0.5^2)^{i-j} \times 1 = \frac{1 - (0.5^2)^i}{1 - 0.5^2} .
\]

In calculating the variance, the second equality has used the fact that $\tilde u_i$ is i.i.d. Since $u_i = \sum_{j=1}^{i} 0.5^{i-j}\tilde u_j$, we know that for k > 0,

\[
u_{i+k} = \sum_{j=1}^{i+k} 0.5^{i+k-j}\tilde u_j
= 0.5^k \sum_{j=1}^{i} 0.5^{i-j}\tilde u_j + \sum_{j=i+1}^{i+k} 0.5^{i+k-j}\tilde u_j
= 0.5^k u_i + \sum_{j=i+1}^{i+k} 0.5^{i+k-j}\tilde u_j .
\]

Because $\tilde u_i$ is i.i.d., the covariance between the random variables $u_i$ and $u_{i+k}$ is

\[
\operatorname{cov}(u_i, u_{i+k}) = \operatorname{cov}\left( u_i,\; 0.5^k u_i + \sum_{j=i+1}^{i+k} 0.5^{i+k-j}\tilde u_j \right) = 0.5^k \sigma_i^2 .
\]

Similarly we can get

\[
\operatorname{cov}(u_i, u_{i-k}) = 0.5^k \sigma_{i-k}^2 .
\]

The column vector U for the regression error is

\[
U = \begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{pmatrix} .
\]

It is straightforward to get

\[
E(UU') = \begin{pmatrix}
E(u_1^2) & E(u_1 u_2) & \cdots & E(u_1 u_n) \\
E(u_2 u_1) & E(u_2^2) & \cdots & E(u_2 u_n) \\
\vdots & \vdots & \ddots & \vdots \\
E(u_n u_1) & E(u_n u_2) & \cdots & E(u_n^2)
\end{pmatrix} .
\]

Because $E(u_i) = 0$, we have $E(u_i^2) = \operatorname{var}(u_i)$ and $E(u_i u_j) = \operatorname{cov}(u_i, u_j)$. Substituting in the results on variances and covariances, we have

\[
\Omega = E(UU') = \begin{pmatrix}
\sigma_1^2 & 0.5\sigma_1^2 & 0.5^2\sigma_1^2 & 0.5^3\sigma_1^2 & \cdots & 0.5^{n-1}\sigma_1^2 \\
0.5\sigma_1^2 & \sigma_2^2 & 0.5\sigma_2^2 & 0.5^2\sigma_2^2 & \cdots & 0.5^{n-2}\sigma_2^2 \\
0.5^2\sigma_1^2 & 0.5\sigma_2^2 & \sigma_3^2 & 0.5\sigma_3^2 & \cdots & 0.5^{n-3}\sigma_3^2 \\
0.5^3\sigma_1^2 & 0.5^2\sigma_2^2 & 0.5\sigma_3^2 & \sigma_4^2 & \cdots & 0.5^{n-4}\sigma_4^2 \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
0.5^{n-1}\sigma_1^2 & 0.5^{n-2}\sigma_2^2 & 0.5^{n-3}\sigma_3^2 & 0.5^{n-4}\sigma_4^2 & \cdots & \sigma_n^2
\end{pmatrix}
\]

with $\sigma_i^2 = \dfrac{1 - (0.5^2)^i}{1 - 0.5^2}$.

(b) The original regression model is

\[
Y_i = \beta_0 + \beta_1 X_i + u_i .
\]

Lagging each side of the regression equation and subtracting 0.5 times this lag from each side gives

\[
Y_i - 0.5Y_{i-1} = 0.5\beta_0 + \beta_1 (X_i - 0.5X_{i-1}) + u_i - 0.5u_{i-1}
\]

for i = 2, …, n, with $u_i - 0.5u_{i-1} = \tilde u_i$. Also

\[
Y_1 = \beta_0 + \beta_1 X_1 + u_1
\]

with $u_1 = \tilde u_1$. Thus we can define a set of new variables

\[
(\tilde Y_i, \tilde X_{1i}, \tilde X_{2i}) = (Y_i - 0.5Y_{i-1},\; 0.5,\; X_i - 0.5X_{i-1})
\]

for i = 2, …, n and $(\tilde Y_1, \tilde X_{11}, \tilde X_{21}) = (Y_1, 1, X_1)$, and estimate the regression equation

\[
\tilde Y_i = \beta_0 \tilde X_{1i} + \beta_1 \tilde X_{2i} + \tilde u_i
\]

using data for i = 1, …, n. The regression error $\tilde u_i$ is i.i.d. and distributed independently of $\tilde X_i$, so the new regression model can be estimated directly by OLS.

9. (a)

\[
\hat\beta = (X'M_W X)^{-1} X'M_W Y = (X'M_W X)^{-1} X'M_W (X\beta + W\gamma + U) = \beta + (X'M_W X)^{-1} X'M_W U .
\]

The last equality has used the orthogonality $M_W W = 0$. Thus

\[
\hat\beta - \beta = (X'M_W X)^{-1} X'M_W U = (n^{-1} X'M_W X)^{-1} (n^{-1} X'M_W U) .
\]

(b) Using $M_W = I_n - P_W$ and $P_W = W(W'W)^{-1}W'$ we get

\[
n^{-1} X'M_W X = n^{-1} X'(I_n - P_W) X = n^{-1} X'X - n^{-1} X'P_W X = n^{-1} X'X - (n^{-1} X'W)(n^{-1} W'W)^{-1}(n^{-1} W'X) .
\]

First consider $n^{-1} X'X = \frac{1}{n}\sum_{i=1}^n X_i X_i'$. The (j, l) element of this matrix is $\frac{1}{n}\sum_{i=1}^n X_{ji} X_{li}$. By Assumption (ii), $X_i$ is i.i.d., so $X_{ji} X_{li}$ is i.i.d. By Assumption (iii) each element of $X_i$ has four moments, so by the Cauchy-Schwarz inequality $X_{ji} X_{li}$ has two moments:

\[
E(X_{ji}^2 X_{li}^2) \le \sqrt{E(X_{ji}^4) \cdot E(X_{li}^4)} < \infty .
\]

Because $X_{ji} X_{li}$ is i.i.d. with two moments, $\frac{1}{n}\sum_{i=1}^n X_{ji} X_{li}$ obeys the law of large numbers, so

\[
\frac{1}{n}\sum_{i=1}^n X_{ji} X_{li} \xrightarrow{p} E(X_{ji} X_{li}) .
\]

This is true for all the elements of $n^{-1} X'X$, so

\[
n^{-1} X'X = \frac{1}{n}\sum_{i=1}^n X_i X_i' \xrightarrow{p} E(X_i X_i') = \Sigma_{XX} .
\]

Applying the same reasoning and using Assumption (ii) that $(X_i, W_i, Y_i)$ are i.i.d. and Assumption (iii) that $(X_i, W_i, u_i)$ have four moments, we have

\[
n^{-1} W'W = \frac{1}{n}\sum_{i=1}^n W_i W_i' \xrightarrow{p} E(W_i W_i') = \Sigma_{WW} ,
\]

\[
n^{-1} X'W = \frac{1}{n}\sum_{i=1}^n X_i W_i' \xrightarrow{p} E(X_i W_i') = \Sigma_{XW} ,
\]

and

\[
n^{-1} W'X = \frac{1}{n}\sum_{i=1}^n W_i X_i' \xrightarrow{p} E(W_i X_i') = \Sigma_{WX} .
\]

From Assumption (iii) we know $\Sigma_{XX}$, $\Sigma_{WW}$, $\Sigma_{XW}$, and $\Sigma_{WX}$ are all finite and non-zero, so Slutsky's theorem implies

\[
n^{-1} X'M_W X = n^{-1} X'X - (n^{-1} X'W)(n^{-1} W'W)^{-1}(n^{-1} W'X) \xrightarrow{p} \Sigma_{XX} - \Sigma_{XW} \Sigma_{WW}^{-1} \Sigma_{WX} ,
\]

which is finite and invertible.

(c) The conditional expectation is

\[
E(U \mid X, W) = \begin{pmatrix} E(u_1 \mid X, W) \\ E(u_2 \mid X, W) \\ \vdots \\ E(u_n \mid X, W) \end{pmatrix}
= \begin{pmatrix} E(u_1 \mid X_1, W_1) \\ E(u_2 \mid X_2, W_2) \\ \vdots \\ E(u_n \mid X_n, W_n) \end{pmatrix}
= \begin{pmatrix} W_1'\delta \\ W_2'\delta \\ \vdots \\ W_n'\delta \end{pmatrix}
= \begin{pmatrix} W_1' \\ W_2' \\ \vdots \\ W_n' \end{pmatrix} \delta = W\delta .
\]

The second equality used Assumption (ii) that $(X_i, W_i, Y_i)$ are i.i.d., and the third equality applied the conditional mean independence assumption (i).

(d) In the limit

\[
n^{-1} X'M_W U \xrightarrow{p} E(X'M_W U \mid X, W) = X'M_W E(U \mid X, W) = X'M_W W\delta = 0_{k_1 \times 1}
\]

because $M_W W = 0$.

(e) $n^{-1} X'M_W X$ converges in probability to a finite invertible matrix, and $n^{-1} X'M_W U$ converges in probability to a zero vector. Applying Slutsky's theorem,

\[
\hat\beta - \beta = (n^{-1} X'M_W X)^{-1} (n^{-1} X'M_W U) \xrightarrow{p} 0 .
\]

This implies $\hat\beta \xrightarrow{p} \beta$.

10. (a) Using the hint: Cq = λq, so that 0 = Cq − λq = CCq − λq = λCq − λq = λ²q − λq = λ(λ − 1)q, and the result follows by inspection.

(b) The trace of a matrix is equal to the sum of its eigenvalues. The rank of a matrix is equal to the number of non-zero eigenvalues. Thus, the result follows from (a).

(c) Because C is symmetric with non-negative eigenvalues, C is positive semidefinite, and the result follows.

11. (a) Using the hint,

\[
C = [Q_1 \;\; Q_2] \begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} Q_1' \\ Q_2' \end{pmatrix} ,
\]

where Q′Q = I. The result follows with $A = Q_1$.

(b) $W = A'V \sim N(A'0, A'I_n A)$ and the result follows immediately.

(c) V′CV = V′AA′V = (A′V)′(A′V) = W′W and the result follows from (b).

12. (a) and (b) These mimic the steps used for TSLS.

13. (a) This follows from the definition of the Lagrangian.

(b) The first order conditions are

(*) $X'(Y - X\tilde\beta) + R'\lambda = 0$

and

(**) $R\tilde\beta - r = 0$.

Solving (*) yields

(***) $\tilde\beta = \hat\beta + (X'X)^{-1}R'\lambda$.

Multiplying by R and using (**) yields $r = R\hat\beta + R(X'X)^{-1}R'\lambda$, so that

\[
\lambda = -[R(X'X)^{-1}R']^{-1}(R\hat\beta - r) .
\]

Substituting this into (***) yields the result.

(c) Using the result in (b), $Y - X\tilde\beta = (Y - X\hat\beta) + X(X'X)^{-1}R'[R(X'X)^{-1}R']^{-1}(R\hat\beta - r)$, so that

\[
\begin{aligned}
(Y - X\tilde\beta)'(Y - X\tilde\beta) &= (Y - X\hat\beta)'(Y - X\hat\beta) + (R\hat\beta - r)'[R(X'X)^{-1}R']^{-1}(R\hat\beta - r) \\
&\quad + 2(Y - X\hat\beta)'X(X'X)^{-1}R'[R(X'X)^{-1}R']^{-1}(R\hat\beta - r) .
\end{aligned}
\]

But $(Y - X\hat\beta)'X = 0$, so the last term vanishes, and the result follows.

(d) The result in (c) shows that $(R\hat\beta - r)'[R(X'X)^{-1}R']^{-1}(R\hat\beta - r) = SSR_{Restricted} - SSR_{Unrestricted}$. Also $s_{\hat u}^2 = SSR_{Unrestricted}/(n - k_{Unrestricted} - 1)$, and the result follows immediately.

14. (a) $\hat\beta'(X'X)\hat\beta = Y'X(X'X)^{-1}X'Y = Y'X_1 H X_1'Y$, where H is the upper $k_1 \times k_1$ block of $(X'X)^{-1}$. Also $R\hat\beta = H X_1'Y$ and $R(X'X)^{-1}R' = H$. Thus $(R\hat\beta)'[R(X'X)^{-1}R']^{-1}(R\hat\beta) = Y'X_1 H X_1'Y$.

(b) (i) Write the second stage regression as $Y = \hat X\beta + U$, where $\hat X$ are the fitted values from the first stage regression. Note that $\hat U'\hat X = 0$, where $\hat U = Y - \hat X\hat\beta$, because OLS residuals are orthogonal to the regressors. Now $\hat U^{TSLS} = Y - X\hat\beta = \hat U - (X - \hat X)\hat\beta = \hat U - \hat V\hat\beta$, where $\hat V$ is the residual from the first stage regression. But, since W is a regressor in the first stage regression, $\hat V'W = 0$. Thus $\hat U^{TSLS\,\prime}W = \hat U'W - \hat\beta'\hat V'W = 0$.

(ii) $\hat\beta'(X'X)\hat\beta = (R\hat\beta)'[R(X'X)^{-1}R']^{-1}(R\hat\beta) = SSR_{Rest} - SSR_{Unrest}$ for the regression in KC 12.6, and the result follows directly.

15. (a) This follows from exercise (18.6).

(b) $\tilde Y_i = \tilde X_i\beta + \tilde u_i$, so that

\[
\begin{aligned}
\hat\beta - \beta &= \left( \sum_{i=1}^n \tilde X_i'\tilde X_i \right)^{-1} \sum_{i=1}^n \tilde X_i'\tilde u_i
= \left( \sum_{i=1}^n \tilde X_i'\tilde X_i \right)^{-1} \sum_{i=1}^n X_i'M'M u_i \\
&= \left( \sum_{i=1}^n \tilde X_i'\tilde X_i \right)^{-1} \sum_{i=1}^n X_i'M' u_i
= \left( \sum_{i=1}^n \tilde X_i'\tilde X_i \right)^{-1} \sum_{i=1}^n \tilde X_i' u_i .
\end{aligned}
\]

(c) (Note a typo in the problem; it should read $Q_{\tilde X} = T^{-1}E(\tilde X_i'\tilde X_i) = T^{-1}\sum_{t=1}^T E(X_{it} - \bar X_i)^2$.) $\hat Q_{\tilde X} = \frac{1}{n}\sum_{i=1}^n \left( T^{-1}\sum_{t=1}^T (X_{it} - \bar X_i)^2 \right)$, where the terms $T^{-1}\sum_{t=1}^T (X_{it} - \bar X_i)^2$ are i.i.d. with mean $Q_{\tilde X}$ and finite variance (because $X_{it}$ has finite fourth moments). The result then follows from the law of large numbers.

(d) This follows from the central limit theorem.

(e) This follows from Slutsky's theorem.

(f) $\eta_i^2$ are i.i.d., and the result follows from the law of large numbers.

(g) Let $\hat\eta_i = T^{-1/2}\tilde X_i'\hat{\tilde u}_i = \eta_i - T^{-1/2}(\hat\beta - \beta)\tilde X_i'\tilde X_i$. Then

\[
\hat\eta_i^2 = \eta_i^2 + T^{-1}(\hat\beta - \beta)^2 (\tilde X_i'\tilde X_i)^2 - 2T^{-1/2}(\hat\beta - \beta)\eta_i \tilde X_i'\tilde X_i
\]

and

\[
\frac{1}{n}\sum_{i=1}^n \hat\eta_i^2 - \frac{1}{n}\sum_{i=1}^n \eta_i^2
= T^{-1}(\hat\beta - \beta)^2 \frac{1}{n}\sum_{i=1}^n (\tilde X_i'\tilde X_i)^2 - 2T^{-1/2}(\hat\beta - \beta) \frac{1}{n}\sum_{i=1}^n \eta_i \tilde X_i'\tilde X_i .
\]

Because $(\hat\beta - \beta) \xrightarrow{p} 0$, the result follows from (a) $\frac{1}{n}\sum_{i=1}^n (\tilde X_i'\tilde X_i)^2 \xrightarrow{p} E[(\tilde X_i'\tilde X_i)^2]$ and (b) $\frac{1}{n}\sum_{i=1}^n \eta_i \tilde X_i'\tilde X_i \xrightarrow{p} E(\eta_i \tilde X_i'\tilde X_i)$. Both (a) and (b) follow from the law of large numbers; both are averages of i.i.d. random variables. Completing the proof requires verifying that $(\tilde X_i'\tilde X_i)^2$ has two finite moments and $\eta_i \tilde X_i'\tilde X_i$ has two finite moments. These in turn follow from 8-moment assumptions for $(X_{it}, u_{it})$ and the Cauchy-Schwarz inequality. Alternatively, a "strong" law of large numbers can be used to show the result with finite fourth moments.

[...]
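
As a numerical cross-check of the error covariance matrix derived above for the process u_1 = ũ_1, u_i = 0.5u_{i−1} + ũ_i, the following minimal Python sketch builds Ω in two ways: directly from the moving-average weights 0.5^(i−j), and from the closed form with (i, j) entry 0.5^|i−j| σ²_min(i,j), where σ²_i = (1 − 0.25^i)/(1 − 0.25). It assumes var(ũ_i) = 1, as in the derivation. The function names (`ma_weights`, `omega_from_weights`, `omega_closed_form`) and the choice n = 6 are illustrative assumptions, not part of the original solutions.

```python
# Hypothetical helper names; a sketch under the assumption var(u-tilde_i) = 1.

def ma_weights(n, rho=0.5):
    """Row i (0-based) holds the weights on u-tilde_1..u-tilde_n in u_{i+1}:
    rho**(i - j) for j <= i, and 0 for later innovations."""
    return [[rho ** (i - j) if j <= i else 0.0 for j in range(n)]
            for i in range(n)]


def omega_from_weights(c):
    """Omega = C C', since the innovations are uncorrelated with unit variance."""
    n = len(c)
    return [[sum(c[i][m] * c[j][m] for m in range(n)) for j in range(n)]
            for i in range(n)]


def omega_closed_form(n, rho=0.5):
    """Closed form: entry (i, j) is rho**|i-j| * sigma2(min(i, j)), 1-based i, j."""
    def sigma2(i):
        return (1 - rho ** (2 * i)) / (1 - rho ** 2)

    return [[rho ** abs(i - j) * sigma2(min(i, j)) for j in range(1, n + 1)]
            for i in range(1, n + 1)]


n = 6  # illustrative sample size
direct = omega_from_weights(ma_weights(n))
closed = omega_closed_form(n)

# Largest entrywise discrepancy between the two constructions.
max_gap = max(abs(direct[i][j] - closed[i][j])
              for i in range(n) for j in range(n))
print(max_gap < 1e-12)
```

The two constructions should agree up to floating-point rounding, which is what the final comparison reports. Changing `rho` or `n` gives a quick way to see how the variances σ²_i approach the stationary value 1/(1 − rho²) as i grows.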