Class Notes in Statistics and Econometrics, Part 29

CHAPTER 57

Applications of GLS with Nonspherical Covariance Matrix

In most cases in which the covariance matrix is nonspherical, Ψ contains unknown parameters, which must be estimated before formula (26.0.2) can be applied. Of course, if all entries of Ψ are unknown, such estimation is impossible, since one needs n(n+1)/2 − 1 parameters to specify a symmetric matrix up to a multiplicative factor, but with n observations only n unrelated parameters can be estimated consistently. Only in a few exceptional cases is Ψ known, and in some even more exceptional cases there are unknown parameters in Ψ but (26.0.2) does not depend on them. We will discuss such examples first: heteroskedastic disturbances with known relative variances, and some examples involving equicorrelated disturbances.

57.1. Cases when OLS and GLS are identical

Problem 498. From y = Xβ + ε with ε ∼ (o, σ²I) follows Py = PXβ + Pε with Pε ∼ (o, σ²PP′). Which conditions must P satisfy so that the generalized least squares regression of Py on PX with covariance matrix PP′ gives the same result as the original regression?

Problem 499. We are in the model y = Xβ + ε, ε ∼ (o, σ²Ψ). As always, we assume X has full column rank and Ψ is nonsingular. We discuss here the special situation in which X and Ψ are such that ΨX = XA for some A.

• a. 3 points Show that the requirement ΨX = XA is equivalent to the requirement that R[ΨX] = R[X]. Here R[B] is the range space of a matrix B, i.e., the vector space consisting of all vectors that can be written in the form Bc for some c. Hint: for ⇒ show first that R[ΨX] ⊂ R[X], and then show that R[ΨX] has the same dimension as R[X].

Answer. ⇒: Clearly R[ΨX] ⊂ R[X], since ΨX = XA and every XAc has the form Xd with d = Ac. And since Ψ is nonsingular, the range space is the space spanned by the column vectors, and the columns of ΨX are the columns of X premultiplied by Ψ, it follows that the range space of ΨX has the same dimension as that of X. ⇐: The ith column of ΨX lies in R[X], i.e., it can be written in the form Xa_i for some a_i. A is the matrix whose columns are all the a_i. □

• b. 2 points Show that A is nonsingular.

Answer. A is square, since XA = ΨX, i.e., XA has as many columns as X. Now assume Ac = o. Then XAc = o or ΨXc = o, and since Ψ is nonsingular this gives Xc = o, and since X has full column rank, this gives c = o. □

• c. 2 points Show that XA⁻¹ = Ψ⁻¹X.

Answer. X = Ψ⁻¹ΨX = Ψ⁻¹XA; now postmultiply by A⁻¹. □

• d. 2 points Show that in this case (X′Ψ⁻¹X)⁻¹X′Ψ⁻¹ = (X′X)⁻¹X′, i.e., OLS is BLUE ("Kruskal's theorem").

Answer. (X′Ψ⁻¹X)⁻¹X′Ψ⁻¹ = ((A⁻¹)′X′X)⁻¹(A⁻¹)′X′ = (X′X)⁻¹A′(A⁻¹)′X′ = (X′X)⁻¹X′. □
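Kruskal's theorem is easy to check numerically. The following minimal numpy sketch (an added illustration, not part of the original notes) uses an equicorrelated Ψ and a regression on the constant term alone, so that Ψι = (1 − ρ + nρ)ι, i.e., ΨX = XA with a scalar A, and confirms that the GLS and OLS estimates coincide:

```python
import numpy as np

# Illustrative check of Kruskal's theorem (Problem 499d), not from the notes:
# with equicorrelated Psi and X = iota (constant term only),
# Psi @ X = (1 - rho + n*rho) * X, so GLS and OLS must coincide.
rng = np.random.default_rng(0)
n, rho = 8, 0.4
X = np.ones((n, 1))                                   # X = iota
Psi = (1 - rho) * np.eye(n) + rho * np.ones((n, n))   # equicorrelated Psi
assert np.allclose(Psi @ X, (1 - rho + n * rho) * X)  # Psi X = X A

y = 2.5 + rng.standard_normal(n)
Pinv = np.linalg.inv(Psi)
beta_gls = np.linalg.solve(X.T @ Pinv @ X, X.T @ Pinv @ y)
beta_ols = np.linalg.solve(X.T @ X, X.T @ y)
assert np.allclose(beta_gls, beta_ols)                # identical estimates
print(float(beta_ols[0]))                             # the sample mean of y
```

Any X whose column space is spanned by eigenvectors of the symmetric matrix Ψ behaves the same way; the constant-plus-equicorrelation pair is just the smallest example.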
57.2. Heteroskedastic Disturbances

Heteroskedasticity means: the error terms are independent, but their variances are not equal. Ψ is diagonal, with positive diagonal elements. In a few rare cases the relative variances are known. The main example is that the observations are means of samples, of varying but known sizes, drawn from a homoskedastic population. This is a plausible example of a situation in which the relative variances are known to be proportional to an observed (positive) nonrandom variable z (which may or may not be one of the explanatory variables in the regression). Here V[ε] = σ²Ψ with a known diagonal matrix

(57.2.1)    \Psi = \begin{bmatrix} z_1 & 0 & \cdots & 0 \\ 0 & z_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & z_n \end{bmatrix}.

Therefore

P = \begin{bmatrix} 1/\sqrt{z_1} & 0 & \cdots & 0 \\ 0 & 1/\sqrt{z_2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1/\sqrt{z_n} \end{bmatrix},

i.e., one divides every observation by the appropriate factor so that after the division the standard deviations are equal. Note that the transformed regression usually no longer has a constant term, and therefore R² also loses its meaning.

Problem 500. 3 points The specification is

(57.2.2)    y_t = \beta_1 + \beta_2 x_t + \beta_3 x_t^2 + \varepsilon_t,

with E[ε_t] = 0, var[ε_t] = σ²x_t² for some unknown σ² > 0, and the errors uncorrelated. Someone runs the OLS regression

(57.2.3)    \frac{y_t}{x_t} = \gamma_1 + \gamma_2 \frac{1}{x_t} + \gamma_3 x_t + v_t

and you have the estimates γ̂_1, γ̂_2, and γ̂_3 from this regression. Compute estimates of β_1, β_2, and β_3 using the γ̂_i. What properties do your estimates of the β_i have?

Answer. Divide the original specification by x_t to get

(57.2.4)    \frac{y_t}{x_t} = \beta_2 + \beta_1 \frac{1}{x_t} + \beta_3 x_t + \frac{\varepsilon_t}{x_t}.

Therefore γ̂_2 is the BLUE of β_1, γ̂_1 that of β_2, and γ̂_3 that of β_3. Note that the constant terms of the old and new regressions switch places! □

Now let us look at a random parameter model y_t = x_t γ_t, or in vector notation, using ∗ for element-by-element multiplication of two vectors, y = x ∗ γ. Here γ_t ∼ IID(β, σ²); one can also write γ_t = β + δ_t, or γ = ιβ + δ with δ ∼ (o, σ²I). This model can be converted into a heteroskedastic least squares model if one defines ε = x ∗ δ. Then y = xβ + ε with ε ∼ (o, σ²Ψ), where

(57.2.5)    \Psi = \begin{bmatrix} x_1^2 & 0 & \cdots & 0 \\ 0 & x_2^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & x_n^2 \end{bmatrix}.

Since x′Ψ⁻¹ = (x⁻¹)′ (taking the inverse element by element), and therefore x′Ψ⁻¹x = n, one gets β̂ = (1/n) Σ y_t/x_t and var[β̂] = σ²/n. On the other hand, x′Ψx = Σ x_t⁴, therefore var[β̂_OLS] = σ² Σx_t⁴/(Σx_t²)². Assuming that the x_t are independent drawings of a random variable x with zero mean and finite fourth moments, it follows that

(57.2.6)    \operatorname{plim} \frac{\operatorname{var}[\hat\beta_{OLS}]}{\operatorname{var}[\hat\beta]} = \operatorname{plim} \frac{n \sum x_t^4}{(\sum x_t^2)^2} = \frac{\operatorname{plim} \frac{1}{n}\sum x_t^4}{(\operatorname{plim} \frac{1}{n}\sum x_t^2)^2} = \frac{E[x^4]}{(E[x^2])^2}.

This is the kurtosis (without subtracting 3). Theoretically it can be anything ≥ 1; the Normal distribution has kurtosis 3, and economic time series usually have a kurtosis between 2 and 4.
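The limit in (57.2.6) is easy to see in simulation. In this added sketch (not from the notes) the ratio n Σx⁴/(Σx²)² is computed for Normal draws, where the kurtosis is 3, and for t(5) draws, where it is 9, illustrating how much more OLS loses relative to the GLS estimator when the regressor is heavy-tailed:

```python
import numpy as np

# Illustrative sketch of (57.2.6), not from the notes: the asymptotic
# efficiency loss of OLS in the random parameter model equals the kurtosis
# E[x^4]/(E[x^2])^2 of the regressor (3 for the Normal, 9 for t(5)).
rng = np.random.default_rng(1)
n = 1_000_000
for draws, label in [(rng.standard_normal(n), "Normal (kurtosis 3)"),
                     (rng.standard_t(5, n), "t(5)    (kurtosis 9)")]:
    ratio = n * np.sum(draws**4) / np.sum(draws**2)**2
    print(label, ratio)
```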
57.3. Equicorrelated Covariance Matrix

Problem 501. Assume y_i = µ + ε_i, where µ is nonrandom, E[ε_i] = 0, var[ε_i] = σ², and cov[ε_i, ε_j] = ρσ² for i ≠ j (i.e., the ε_i are equicorrelated):

(57.3.1)    V[\varepsilon] = \sigma^2 \begin{bmatrix} 1 & \rho & \cdots & \rho \\ \rho & 1 & \cdots & \rho \\ \vdots & \vdots & \ddots & \vdots \\ \rho & \rho & \cdots & 1 \end{bmatrix}.

If ρ ≥ 0, these error terms could have been obtained as follows: ε = z + ιu, where z ∼ (o, τ²I) and u ∼ (0, ω²) independent of z.

• a. 1 point Show that the covariance matrix of ε is V[ε] = τ²I + ω²ιι′.

Answer. V[ιu] = ι var[u] ι′; add this to V[z]. □

• b. 1 point What are the values of τ² and ω² so that ε has the above covariance structure?

Answer. To write it in the desired form, the following identities must hold: for the off-diagonal elements σ²ρ = ω², which gives the desired formula for ω², and for the diagonal elements σ² = τ² + ω². Solving for τ² and plugging in the formula for ω² gives τ² = σ² − ω² = σ²(1 − ρ). □

• c. 3 points Using matrix identity (A.8.20) (for ordinary inverses, not for g-inverses), show that the generalized least squares formula for the BLUE in this model is equivalent to the ordinary least squares formula. In other words, show that the sample mean ȳ is the BLUE of µ.

Answer. Setting γ = τ²/ω², so that V[ε] = τ²(I + ιι′/γ), we want to show that

(57.3.2)    \Bigl(\iota'\bigl(I + \iota\iota'/\gamma\bigr)^{-1}\iota\Bigr)^{-1} \iota'\bigl(I + \iota\iota'/\gamma\bigr)^{-1} y = \bigl(\iota' I^{-1} \iota\bigr)^{-1} \iota' I^{-1} y.

This is even true for arbitrary h and A:

(57.3.3)    h'\bigl(A + hh'/\gamma\bigr)^{-1} = \frac{\gamma}{\gamma + h'A^{-1}h}\, h'A^{-1};

(57.3.4)    \Bigl(h'\bigl(A + hh'/\gamma\bigr)^{-1}h\Bigr)^{-1} = \frac{\gamma + h'A^{-1}h}{\gamma\, h'A^{-1}h} = \frac{1}{h'A^{-1}h} + \frac{1}{\gamma}.

Now multiply the left-hand sides and the right-hand sides (using the middle term in (57.3.4)):

(57.3.5)    \Bigl(h'\bigl(A + hh'/\gamma\bigr)^{-1}h\Bigr)^{-1} h'\bigl(A + hh'/\gamma\bigr)^{-1} = \bigl(h'A^{-1}h\bigr)^{-1} h'A^{-1}. □

• d. 3 points [Gre97, Example 11.1 on pp. 499/500]: Show that var[ȳ] does not converge to zero as n → ∞ while ρ remains constant.

Answer. By (57.3.4) with h = ι and A = I (so that h′A⁻¹h = n),

(57.3.6)    \operatorname{var}[\bar y] = \tau^2\Bigl(\frac{1}{n} + \frac{1}{\gamma}\Bigr) = \sigma^2\Bigl(\frac{1-\rho}{n} + \rho\Bigr) = \frac{\tau^2}{n} + \omega^2.

As n → ∞ this converges to ω², not to 0. □
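The following added fragment (not in the original) tabulates var[ȳ] = ι′V[ε]ι/n² for growing n and shows it flattening out at ω² = σ²ρ instead of shrinking to zero, exactly as (57.3.6) predicts:

```python
import numpy as np

# Sketch of Problem 501d (not in the original notes): under equicorrelation
# the variance of the sample mean converges to omega^2 = sigma^2 * rho.
sigma2, rho = 1.0, 0.3
for n in (10, 100, 1000):
    V = sigma2 * ((1 - rho) * np.eye(n) + rho * np.ones((n, n)))
    iota = np.ones(n)
    var_ybar = iota @ V @ iota / n**2       # var[ybar] = iota' V iota / n^2
    print(n, var_ybar, sigma2 * ((1 - rho) / n + rho))  # matches (57.3.6)
```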
Problem 502. [Chr87, pp. 361–363] Assume there are 1000 families in a certain town, and denote the income of family k by z_k. Let µ = (1/1000) Σ_{k=1}^{1000} z_k be the population average of all 1000 incomes in this finite population, and let σ² = (1/1000) Σ_{k=1}^{1000} (z_k − µ)² be the population variance of the incomes. For the purposes of this question, the z_k are nonrandom, therefore µ and σ² are nonrandom as well.

You pick 20 families at random without replacement, ask them what their income is, and you want to compute the BLUE of µ on the basis of this random sample. Call the incomes in the sample y_1, ..., y_20. We use the letters y_i instead of z_i for this sample because y_1 is not necessarily z_1, the income of family 1; it may be, e.g., z_258. The y_i are random. The process of taking the sample of the y_i is represented by a 20 × 1000 matrix of random variables q_ik (i = 1, ..., 20, k = 1, ..., 1000) with: q_ik = 1 if family k has been picked as the ith family in the sample, and 0 otherwise. In other words, y_i = Σ_{k=1}^{1000} q_ik z_k, or y = Qz.

• a. Let i ≠ j and k ≠ l. Is q_ik independent of q_il? Is q_ik independent of q_jk? Is q_ik independent of q_jl?

Answer. q_ik is not independent of q_il: if q_ik = 1, this means that family k has been selected as the ith family in the sample. Since only one family can be selected as the ith family in the sample, this implies q_il = 0 for all l ≠ k. q_ik is dependent on q_jk, because sampling is without replacement: if family k has been selected as the ith family in the sample, then it cannot be selected again as the jth family of the sample. Is q_ik independent of q_jl? I think it is. □

• b. Show that the first and second moments are

(57.3.7)    E[q_{ik}] = 1/1000, \qquad E[q_{ik}q_{jl}] = \begin{cases} 1/1000 & \text{if } i = j \text{ and } k = l \\ 1/(1000 \cdot 999) & \text{if } i \neq j \text{ and } k \neq l \\ 0 & \text{otherwise.} \end{cases}

For these formulas you need the rules for taking expected values of discrete random variables.

Answer. Since q_ik is a zero-one variable, E[q_ik] = Pr[q_ik = 1] = 1/1000. This is obvious if i = 1, and one can use a symmetry argument that it should not depend on i. And since for a zero-one variable q_ik² = q_ik, it follows that E[q_ik²] = 1/1000 too. Now for i ≠ j and k ≠ l, E[q_ik q_jl] = Pr[q_ik = 1 ∩ q_jl = 1] = (1/1000)(1/999). Again this is obvious for i = 1 and j = 2, and can be extended by symmetry to arbitrary pairs i ≠ j. For i ≠ j, E[q_ik q_jk] = 0 since z_k cannot be chosen twice, and for k ≠ l, E[q_ik q_il] = 0 since only one z_k can be chosen as the ith element in the sample. □

• c. Since Σ_{k=1}^{1000} q_ik = 1 for all i, one can write

(57.3.8)    y_i = \mu + \sum_{k=1}^{1000} q_{ik}(z_k - \mu) = \mu + \varepsilon_i, \quad\text{where } \varepsilon_i = \sum_{k=1}^{1000} q_{ik}(z_k - \mu).

Show that

(57.3.9)    E[\varepsilon_i] = 0, \qquad \operatorname{var}[\varepsilon_i] = \sigma^2, \qquad \operatorname{cov}[\varepsilon_i, \varepsilon_j] = -\sigma^2/999 \text{ for } i \neq j.

Hint: For the covariance note that from 0 = Σ_{k=1}^{1000}(z_k − µ) follows

(57.3.10)    0 = \sum_{k=1}^{1000}(z_k - \mu)\sum_{l=1}^{1000}(z_l - \mu) = \sum_{k \neq l}(z_k - \mu)(z_l - \mu) + \sum_{k=1}^{1000}(z_k - \mu)^2 = \sum_{k \neq l}(z_k - \mu)(z_l - \mu) + 1000\sigma^2.

[...]

CHAPTER 58 (excerpts)

Unknown Parameters in the Covariance Matrix

[...] described. Then there is the Goldfeld-Quandt test: if it is possible to order the observations in order of increasing error variance, run separate regressions on the portion of the data with low variance and on that with high variance, perhaps leaving out some observations in the middle to increase the power of the test, and then simply perform an F-test with (SSE_high/d.f.)/(SSE_low/d.f.). [...]

58.2. Autocorrelation

[...] This estimator has become very fashionable, since one does not have to bother with estimating the covariance structure, and since OLS is not too inefficient in these situations. It has been observed, however, that this estimator gives too small confidence intervals in small samples. Therefore it is recommended in small samples to multiply the estimated variance by the factor [...]

[...] This question is formulated in such a way that you can do each part of it independently of the others. Therefore if you get stuck, just go on to the next part. We are working in the linear regression model y_t = x_t′β + ε_t, t = 1, ..., n, in which the following is known about the disturbances ε_t: for t = 2, ..., n one can write ε_t = ρε_{t−1} + v_t with an unknown nonrandom ρ, and the v_t are well behaved, [...] v_t ∼ (0, σ_v²) and v_s independent of v_t for s ≠ t. The first disturbance ε_1 has a finite variance and is independent of v_2, ..., v_n.

• a. 1 point Show by induction that v_t is independent of all ε_s with 1 ≤ s < t ≤ n.

Answer. v_t (2 ≤ t ≤ n) is independent of ε_1 by assumption. Now assume 2 ≤ s ≤ t − 1 and v_t is independent of ε_{s−1}. Since ε_s = ρε_{s−1} + v_s, and v_t is by [...]

Problem 506. [JHG+88, p. 577] and [Gre97, 13.4.1] Assume

(58.2.1)    y_t = α + βy_{t−1} + ε_t,    |β| < 1,
(58.2.2)    ε_t = ρε_{t−1} + v_t,    |ρ| < 1,

where v_t ∼ IID(0, σ_v²) and all v_t are independent of ε_0 and y_0.

• a. 2 points Show that v_t is independent of all ε_s and y_s for 0 ≤ s < t.

Answer. Both proofs are by induction. First, independence of v_t and ε_s: by the induction assumption, v_t is independent of ε_{s−1}, and since t > s, i.e., t ≠ s, v_t is also independent of v_s; therefore v_t is independent of ε_s = ρε_{s−1} + v_s. Now, independence of v_t and y_s: by the induction assumption, v_t is independent of y_{s−1}, and since t > s, v_t is also independent of ε_s; therefore v_t is independent of y_s = α + βy_{s−1} + ε_s. □

• b. 3 points Show that var[ε_t] = ρ^{2t} var[ε_0] + (1 − ρ^{2t}) σ_v²/(1 − ρ²). (Hint: use induc[...])
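The formula in part b can be checked mechanically. This added sketch (not part of the notes) iterates the recursion var[ε_t] = ρ² var[ε_{t−1}] + σ_v², which follows from part a because v_t is independent of ε_{t−1}, and verifies the closed form at every step:

```python
import numpy as np

# Sketch (not from the notes) checking Problem 506b: iterating
# var[eps_t] = rho^2 var[eps_{t-1}] + sigma_v^2 reproduces the closed form
# rho^(2t) var[eps_0] + (1 - rho^(2t)) sigma_v^2 / (1 - rho^2).
rho, sigma_v2, var_eps0 = 0.7, 1.0, 5.0
v = var_eps0
for t in range(1, 11):
    v = rho**2 * v + sigma_v2      # recursion from eps_t = rho*eps_{t-1} + v_t
    closed = rho**(2*t) * var_eps0 + (1 - rho**(2*t)) * sigma_v2 / (1 - rho**2)
    assert np.isclose(v, closed)
print(v, sigma_v2 / (1 - rho**2))  # var[eps_t] approaches the stationary value
```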
Properties of OLS in the presence of autocorrelation. If the correlation between the observations dies off sufficiently rapidly as the observations become further apart in time, OLS is consistent and asymptotically normal, but inefficient. There is one important exception to this rule: if the regression includes lagged dependent variables and there is autocorrelation, then OLS, and also GLS, is inconsistent.

[...]

[...] unknown nonrandom parameters, and Z = \begin{bmatrix} z_1' \\ \vdots \\ z_n' \end{bmatrix} consists of observations of m nonrandom explanatory variables which include the constant "variable" ι. The variables in Z are often functions of certain variables in X, but this is not necessary for the derivation that follows. A special case of this specification is σ_t² = σ²x_t^p or, after taking logarithms, ln σ_t² = ln σ² + p ln x_t. Here Z = [ι  ln x] and α [...]

[...] ln(1 − ρ²). And since Ψ⁻¹ = P′P and σ_ε² = σ_v²/(1 − ρ²), the last terms coincide too, because (y − Xβ)′Ψ⁻¹(y − Xβ) = (y − Xβ)′P′P(y − Xβ). [...]

• i. 4 points Show that, if one concentrates out σ_v², i.e., maximizes this likelihood function with respect to σ_v², taking all other parameters as given, one obtains the concentrated log likelihood (58.2.29). [...]

Now in the constrained case, with homoskedasticity assumed, Ψ = I, and we will write the OLS estimator as $\hat{\hat\beta}$ and $\hat{\hat\sigma}^2 = \hat{\hat\varepsilon}'\hat{\hat\varepsilon}/n$. Then $\ln\det[\hat{\hat\sigma}^2 I] = n \ln \hat{\hat\sigma}^2$. Let $\hat\beta$ be the unconstrained MLE, and

(58.1.5)–(58.1.6)    \hat\Psi = \begin{bmatrix} \hat\sigma_1^2 I & O \\ O & \hat\sigma_2^2 I \end{bmatrix},

where $\hat\sigma_i^2 = \hat\varepsilon_i'\hat\varepsilon_i/n_i$. The LR statistic is therefore (compare [Gre97, p. 516])

(58.1.7)    λ = 2(log f_unconstrained − log f_constrained) [...]
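To make the likelihood ratio test concrete, here is a hedged numpy sketch (added here; the data-generating setup and all variable names are my own) of a two-group test of homoskedasticity. For simplicity β is estimated once by pooled OLS rather than by full ML iteration, so the statistic is an approximation to (58.1.7):

```python
import numpy as np

# Hedged sketch of the LR test for groupwise heteroskedasticity (58.1.7),
# not from the notes. Constrained model: one error variance for all n
# observations; unconstrained: separate variances for two groups. Beta is
# estimated by pooled OLS in both cases, a simplification of full ML.
rng = np.random.default_rng(2)
n1, n2 = 60, 40
X = np.column_stack([np.ones(n1 + n2), rng.standard_normal(n1 + n2)])
sd = np.r_[np.full(n1, 1.0), np.full(n2, 3.0)]        # group 2 is noisier
y = X @ np.array([1.0, 2.0]) + sd * rng.standard_normal(n1 + n2)

def gauss_loglik(resid, sigma2):
    # Gaussian log likelihood with per-observation variances sigma2
    return -0.5 * np.sum(np.log(2 * np.pi * sigma2) + resid**2 / sigma2)

beta = np.linalg.lstsq(X, y, rcond=None)[0]           # pooled OLS
e = y - X @ beta
ll_c = gauss_loglik(e, np.full(n1 + n2, e @ e / (n1 + n2)))   # one variance
s2 = np.r_[np.full(n1, e[:n1] @ e[:n1] / n1),
           np.full(n2, e[n1:] @ e[n1:] / n2)]                 # two variances
ll_u = gauss_loglik(e, s2)
lam = 2 * (ll_u - ll_c)
print(lam)   # compare with the chi-square(1) critical value, 3.84 at 5%
```

Under the null the statistic is asymptotically chi-square with one degree of freedom, since the unconstrained model has one extra variance parameter.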
