
Class Notes in Statistics and Econometrics, Part 31


CHAPTER 61

Random Coefficients

The random coefficient model first developed in [HH68] cannot be written in the form $y = X\beta + \varepsilon$ because each observation has a different $\beta$. Therefore we have to write it observation by observation: $y_t = x_t^\top \beta_t$ (no separate disturbance term), where $\beta_t = \bar\beta + v_t$ with $v_t \sim (o, \tau^2 \Sigma)$. For $s \neq t$, $v_s$ and $v_t$ are uncorrelated. By re-grouping terms one gets $y_t = x_t^\top \bar\beta + x_t^\top v_t = x_t^\top \bar\beta + \varepsilon_t$, where $\varepsilon_t = x_t^\top v_t$, hence $\operatorname{var}[\varepsilon_t] = \tau^2 x_t^\top \Sigma x_t$, and for $s \neq t$, $\varepsilon_s$ and $\varepsilon_t$ are uncorrelated.

In tiles, this model is

(61.0.32) [tile diagram, lost in extraction: $y = XB\Delta$ with $B = \bar\beta\,\iota^\top + V$; the tiles carry the dimensions $t$ and $k$]

Estimation under the assumption $\Sigma$ is known: To estimate $\bar\beta$ one can use the heteroskedastic model with error variances $\tau^2 x_t^\top \Sigma x_t$; call the resulting estimate $\hat{\bar\beta}$. The formula for the best linear unbiased predictor of $\beta_t$ itself can be derived (heuristically) as follows: Assume for a moment that $\bar\beta$ is known; then the model can be written as $y_t - x_t^\top \bar\beta = x_t^\top v_t$. Then we can use the formula for the Best Linear Predictor, equation (??), applied to the situation

(61.0.33) $\begin{bmatrix} x_t^\top v_t \\ v_t \end{bmatrix} \sim \left( \begin{bmatrix} 0 \\ o \end{bmatrix},\ \tau^2 \begin{bmatrix} x_t^\top \Sigma x_t & x_t^\top \Sigma \\ \Sigma x_t & \Sigma \end{bmatrix} \right)$

where $x_t^\top v_t$ is observed (its value is $y_t - x_t^\top \bar\beta$), but $v_t$ is not. Note that here we predict a whole vector on the basis of one linear combination of its elements only. This predictor is

(61.0.34) $v_t^* = \Sigma x_t (x_t^\top \Sigma x_t)^{-1} (y_t - x_t^\top \bar\beta)$

If one adds $\bar\beta$ to both sides, one obtains

(61.0.35) $\beta_t^* = \bar\beta + \Sigma x_t (x_t^\top \Sigma x_t)^{-1} (y_t - x_t^\top \bar\beta)$

If one now replaces $\bar\beta$ by $\hat{\bar\beta}$, one obtains the formula for the predictor given in [JHG+88, p. 438]:

(61.0.36) $\beta_t^* = \hat{\bar\beta} + \Sigma x_t (x_t^\top \Sigma x_t)^{-1} (y_t - x_t^\top \hat{\bar\beta}).$

Usually, of course, $\Sigma$ is unknown. But if the number of observations is large enough, one can estimate the elements of the covariance matrix $\tau^2 \Sigma$. This is the fact which gives relevance to this model. Write $\tau^2 x_t^\top \Sigma x_t = \tau^2 \operatorname{tr} x_t^\top \Sigma x_t = \tau^2 \operatorname{tr} x_t x_t^\top \Sigma = z_t^\top \alpha$, where $z_t$ is the vector containing the unique elements of the symmetric matrix $x_t x_t^\top$, with those elements not located on the diagonal multiplied by the factor 2, since they occur twice in the matrix, and $\alpha$ contains the corresponding unique elements of $\tau^2 \Sigma$ (but no factors 2 here). For instance, if there are three variables, then $\tau^2 x_t^\top \Sigma x_t = x_{t1}^2 \tau_{11} + x_{t2}^2 \tau_{22} + x_{t3}^2 \tau_{33} + 2 x_{t1} x_{t2} \tau_{12} + 2 x_{t1} x_{t3} \tau_{13} + 2 x_{t2} x_{t3} \tau_{23}$, where $\tau_{ij}$ are the elements of $\tau^2 \Sigma$. Therefore $z_t$ consists of $x_{t1}^2, x_{t2}^2, x_{t3}^2, 2 x_{t1} x_{t2}, 2 x_{t1} x_{t3}, 2 x_{t2} x_{t3}$, and $\alpha^\top = [\tau_{11}, \tau_{22}, \tau_{33}, \tau_{12}, \tau_{13}, \tau_{23}]$. Then construct the matrix $Z$ which has as its $t$th row the vector $z_t^\top$; it follows $\mathcal{V}[\varepsilon] = \operatorname{diag}(\gamma)$ where $\gamma = Z\alpha$.
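The two steps just described, the heteroskedastic GLS estimate of $\bar\beta$ and the predictor (61.0.36), together with the construction of $z_t$, translate directly into code. The following NumPy sketch is not part of the notes: the function names are mine, and $\Sigma$ and $\tau^2$ are taken as known inputs.

```python
import numpy as np

def build_z_row(x):
    """z_t: unique elements of x x^T, squares first, then each
    off-diagonal product doubled (it occurs twice in x x^T)."""
    k = len(x)
    squares = [x[i] * x[i] for i in range(k)]
    doubled = [2.0 * x[i] * x[j] for i in range(k) for j in range(i + 1, k)]
    return np.array(squares + doubled)

def random_coefficients_known_sigma(y, X, Sigma, tau2):
    """GLS estimate of beta-bar, then the predictor (61.0.36) for each beta_t."""
    var_eps = tau2 * np.einsum("ti,ij,tj->t", X, Sigma, X)  # tau^2 x_t^T Sigma x_t
    w = 1.0 / var_eps                                       # GLS weights
    beta_bar = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))
    resid = y - X @ beta_bar                                # y_t - x_t^T beta_bar
    # beta*_t = beta-bar + Sigma x_t (x_t^T Sigma x_t)^{-1} (y_t - x_t^T beta_bar);
    # the rows of X @ Sigma are (Sigma x_t)^T because Sigma is symmetric
    betas = beta_bar + (X @ Sigma) * (resid * tau2 / var_eps)[:, None]
    return beta_bar, betas                                  # row t of betas is beta*_t
```

Note that rescaling all weights by a constant leaves the GLS estimate unchanged, and $\tau^2$ also cancels inside the predictor, so in this known-covariance case only $\Sigma$ actually matters.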
Using this notation and defining, as usual, $M = I - X(X^\top X)^{-1} X^\top$, writing $m_t$ for the $t$th column vector of $M$, writing $Q$ for the matrix whose elements are the squares of the elements of $M$, and writing $\delta_t$ for the vector that has 1 in the $t$th place and 0 elsewhere, one can derive:

(61.0.37) $E[\hat\varepsilon_t^2] = E[(\delta_t^\top \hat\varepsilon)^2]$
(61.0.38) $= E[\hat\varepsilon^\top \delta_t \delta_t^\top \hat\varepsilon]$
(61.0.39) $= E[\varepsilon^\top M \delta_t \delta_t^\top M \varepsilon]$
(61.0.40) $= E[\operatorname{tr} M \delta_t \delta_t^\top M \varepsilon \varepsilon^\top]$
(61.0.41) $= \operatorname{tr} M \delta_t \delta_t^\top M \operatorname{diag}(\gamma) = \operatorname{tr} m_t m_t^\top \operatorname{diag}(\gamma)$
(61.0.42) $= m_t^\top \operatorname{diag}(\gamma) m_t = m_{t1} \gamma_1 m_{1t} + \cdots + m_{tn} \gamma_n m_{nt}$
(61.0.43) $= q_t^\top \gamma = q_t^\top Z \alpha$
(61.0.44) $E[\hat\varepsilon^2] = Q Z \alpha,$

where $\alpha$ is as above and $\hat\varepsilon^2$ denotes the vector with elements $\hat\varepsilon_t^2$. This allows one to get an estimate $\hat\alpha$ of $\alpha$ by regressing the vector $[\hat\varepsilon_1^2, \ldots, \hat\varepsilon_n^2]^\top$ on $QZ$, and then to use $Z\hat\alpha$ to get an estimate of the variances $\tau^2 x_t^\top \Sigma x_t$. Unfortunately, the estimated covariance matrix one gets in this way may not be nonnegative definite.
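A minimal sketch of this estimation step, reusing `build_z_row` from the sketch above (again, the names are mine and this is only one way to set up the regression of (61.0.44)):

```python
def estimate_alpha(y, X):
    """Estimate alpha, the unique elements of tau^2 Sigma, from E[e^2] = Q Z alpha."""
    n, k = X.shape
    M = np.eye(n) - X @ np.linalg.solve(X.T @ X, X.T)  # residual maker M
    e = M @ y                                          # OLS residuals
    Q = M * M                                          # elementwise squares of M
    Z = np.vstack([build_z_row(x) for x in X])         # t-th row is z_t^T
    alpha_hat, *_ = np.linalg.lstsq(Q @ Z, e**2, rcond=None)
    return alpha_hat, Z @ alpha_hat                    # alpha-hat and fitted variances
```

As the text warns, the matrix $\tau^2\Sigma$ reassembled from `alpha_hat` need not be nonnegative definite; one common repair, not discussed in the notes, is to truncate its negative eigenvalues at zero.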
[Gre97, p. 669–674] brings this model in the following form:

(61.0.45) [tile diagram, lost in extraction: $Y = XB\Delta + E$ with $B = \bar\beta\,\iota^\top + V$; the tiles carry the dimensions $t$, $k$, and $n$]

Problem 513. Let $y_i$ be the $i$th column of $Y$. The random coefficients model as discussed in [Gre97, p. 669–674] specifies $y_i = X_i \beta_i + \varepsilon_i$ with $\varepsilon_i \sim (o, \sigma_i^2 I)$ and $\varepsilon_i$ uncorrelated with $\varepsilon_j$ for $i \neq j$. Furthermore $\beta_i$ is also random; write it as $\beta_i = \beta + v_i$, with $v_i \sim (o, \tau^2 \Gamma)$ for a positive definite $\Gamma$, and again $v_i$ uncorrelated with $v_j$ for $i \neq j$. Furthermore, all $v_i$ are uncorrelated with all $\varepsilon_j$.

• a. 4 points In this model the disturbance term is really $w_i = \varepsilon_i + X_i v_i$, which has covariance matrix $\mathcal{V}[w_i] = \sigma_i^2 I + \tau^2 X_i \Gamma X_i^\top$. As a preliminary calculation for the next part of the question show that

(61.0.46) $X_i^\top (\mathcal{V}[w_i])^{-1} = \frac{1}{\tau^2} \Gamma^{-1} (X_i^\top X_i + \kappa_i^2 \Gamma^{-1})^{-1} X_i^\top$

where $\kappa_i^2 = \sigma_i^2 / \tau^2$. You are allowed to use, without proof, formula (A.8.13), which reads for inverses, not generalized inverses:

(61.0.47) $(A + B D^{-1} C)^{-1} = A^{-1} - A^{-1} B (D + C A^{-1} B)^{-1} C A^{-1}$

Answer. In (61.0.47) set $A = \sigma_i^2 I$, $B = X_i$, $D^{-1} = \tau^2 \Gamma$, and $C = X_i^\top$ to get

(61.0.48) $(\mathcal{V}[w_i])^{-1} = \frac{1}{\sigma_i^2} \left( I - X_i (X_i^\top X_i + \kappa_i^2 \Gamma^{-1})^{-1} X_i^\top \right)$

Premultiply this by $X_i^\top$ and add and subtract the same term:

(61.0.49) $X_i^\top (\mathcal{V}[w_i])^{-1} = \frac{1}{\sigma_i^2} X_i^\top - \frac{1}{\sigma_i^2} (X_i^\top X_i + \kappa_i^2 \Gamma^{-1} - \kappa_i^2 \Gamma^{-1}) (X_i^\top X_i + \kappa_i^2 \Gamma^{-1})^{-1} X_i^\top$

(61.0.50) $= \frac{1}{\sigma_i^2} X_i^\top - \frac{1}{\sigma_i^2} X_i^\top + \frac{1}{\sigma_i^2} \kappa_i^2 \Gamma^{-1} (X_i^\top X_i + \kappa_i^2 \Gamma^{-1})^{-1} X_i^\top = \frac{1}{\tau^2} \Gamma^{-1} (X_i^\top X_i + \kappa_i^2 \Gamma^{-1})^{-1} X_i^\top.$

• b. 2 points From (61.0.46) derive:

(61.0.51) $\left( X_i^\top (\mathcal{V}[w_i])^{-1} X_i \right)^{-1} = \sigma_i^2 (X_i^\top X_i)^{-1} + \tau^2 \Gamma$

Answer. From (61.0.46) follows

(61.0.52) $X_i^\top (\mathcal{V}[w_i])^{-1} X_i = \frac{1}{\tau^2} \Gamma^{-1} (X_i^\top X_i + \kappa_i^2 \Gamma^{-1})^{-1} X_i^\top X_i$

This is the product of three matrices each of which has an inverse:

(61.0.53) $\left( X_i^\top (\mathcal{V}[w_i])^{-1} X_i \right)^{-1} = \tau^2 (X_i^\top X_i)^{-1} (X_i^\top X_i + \kappa_i^2 \Gamma^{-1}) \Gamma = (\tau^2 I + \sigma_i^2 (X_i^\top X_i)^{-1} \Gamma^{-1}) \Gamma = \tau^2 \Gamma + \sigma_i^2 (X_i^\top X_i)^{-1}.$

• c. 2 points Show that from (61.0.46) it also follows that the GLS applied to each column of $Y$ separately is the OLS $\hat\beta_i = (X_i^\top X_i)^{-1} X_i^\top y_i$.

• d. 2 points Show that $\mathcal{V}[\hat\beta_i] = \sigma_i^2 (X_i^\top X_i)^{-1} + \tau^2 \Gamma$.

Answer. Since $\mathcal{V}[y_i] = \sigma_i^2 I + \tau^2 X_i \Gamma X_i^\top$, it follows $\mathcal{V}[\hat\beta_i] = \sigma_i^2 (X_i^\top X_i)^{-1} + \tau^2 (X_i^\top X_i)^{-1} X_i^\top X_i \Gamma X_i^\top X_i (X_i^\top X_i)^{-1} = \sigma_i^2 (X_i^\top X_i)^{-1} + \tau^2 \Gamma$, as postulated.

• e. 3 points [Gre97, p. 670] describes a procedure for estimating the covariance matrices if they are unknown. Explain this procedure clearly in your own words, and spell out the conditions under which it is a consistent estimate.

Answer. If $\Gamma$ is unknown, it is possible to get it from the sample covariance matrix of the group-specific OLS estimates, as long as the $\sigma_i^2$ and the $X_i$ are such that asymptotically $\frac{1}{n-1} \sum (\hat\beta_i - \bar{\hat\beta})(\hat\beta_i - \bar{\hat\beta})^\top$ is the same as $\frac{1}{n} \sum (\hat\beta_i - \beta)(\hat\beta_i - \beta)^\top$, which again is asymptotically the same as $\frac{1}{n} \sum \mathcal{V}[\hat\beta_i]$. We also need that asymptotically $\frac{1}{n} \sum s_i^2 (X_i^\top X_i)^{-1} = \frac{1}{n} \sum \sigma_i^2 (X_i^\top X_i)^{-1}$. If these substitutions can be made, then $\operatorname{plim} \left( \frac{1}{n-1} \sum (\hat\beta_i - \bar{\hat\beta})(\hat\beta_i - \bar{\hat\beta})^\top - \frac{1}{n} \sum s_i^2 (X_i^\top X_i)^{-1} \right) = \tau^2 \Gamma$, since $\frac{1}{n} \sum \mathcal{V}[\hat\beta_i] = \tau^2 \Gamma + \frac{1}{n} \sum \sigma_i^2 (X_i^\top X_i)^{-1}$. This is [Gre97, (15-29) on p. 670].

Problem 514. 5 points Describe in words how the “Random Coefficient Model” differs from an ordinary regression model, how it can be estimated, and describe situations in which it may be appropriate. Use your own words instead of excerpting the notes; don’t give unnecessary detail but give an overview which will allow one to decide whether this is a good model for a given situation.

Answer. If $\Sigma$ is known, estimation proceeds in two steps: first estimate $\bar\beta$ by a heteroskedastic GLS model, and then predict, or better retrodict, the actual value taken by the $\beta_t$ by the usual linear prediction formulas. But the most important aspect of the model is that it is possible to estimate $\Sigma$ if it is not known! This is possible because each $v_t$ imposes a different but known pattern of heteroskedasticity on the error terms; it, so to say, leaves its footprints, and if one has enough observations, it is possible to reconstruct the covariance matrix from these footprints.

Problem 515. 4 points The specification is

(61.0.54) $y_t = \alpha + \beta_t x_t + \gamma x_t^2$

(no separate disturbance term), where $\alpha$ and $\gamma$ are constants, and $\beta_t$ is the $t$th element of a random vector $\beta \sim (\iota\mu, \tau^2 I)$. Explain how you would estimate $\alpha$, $\gamma$, $\mu$, and $\tau^2$.

Answer. Set $v = \beta - \iota\mu$; it is $v \sim (o, \tau^2 I)$ and one gets

(61.0.55) $y_t = \alpha + \mu x_t + \gamma x_t^2 + v_t x_t$

This is regression with a heteroskedastic disturbance term. Therefore one has to specify the weights $1/x_t^2$; if one does that, one gets

(61.0.56) $\dfrac{y_t}{x_t} = \dfrac{\alpha}{x_t} + \mu + \gamma x_t + v_t$

The coefficient estimates are the obvious ones, and the variance estimate in this regression is an unbiased estimate of $\tau^2$.
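A short simulation makes the recipe of Problem 515 concrete. This sketch uses invented parameter values (all names and numbers are mine); the weighted regression is exactly (61.0.56), and its residual variance estimates $\tau^2$.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000
alpha, gamma, mu, tau = 1.0, -0.5, 2.0, 0.3       # invented true values

x = rng.uniform(0.5, 3.0, size=n)                 # keep x_t away from 0 so 1/x_t is safe
beta_t = mu + tau * rng.standard_normal(n)        # beta ~ (iota mu, tau^2 I)
y = alpha + beta_t * x + gamma * x**2             # (61.0.54): no separate disturbance

# (61.0.56): y/x = alpha (1/x) + mu + gamma x + v_t, homoskedastic in v_t
W = np.column_stack([1.0 / x, np.ones(n), x])
coef, *_ = np.linalg.lstsq(W, y / x, rcond=None)
alpha_hat, mu_hat, gamma_hat = coef
resid = y / x - W @ coef
tau2_hat = resid @ resid / (n - 3)                # unbiased estimate of tau^2
```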
[...]

CHAPTER 62

Multivariate Regression

[...] “Equations Systems,” in which the dependent variable in one equation may be the explanatory variable in another equation. [...] satisfy the same kind of linear constraint, one gets a model which is sometimes called a growth curve model. These models will be discussed in the remainder of this chapter. In a second basic model, the explanatory variables are different, but the coefficient vector is the same. In tiles:

(62.1.2) [tile diagram, lost in extraction: $Y = X\beta + E$; the tiles carry the dimensions $t$, $k$, and $p$]

These models are used for pooling cross-sectional and [...]

62.2 Multivariate Regression with Equal Regressors

The multivariate regression model with equal regressors reads

(62.2.1) $Y = XB + E$

where we make the following assumptions: $X$ is nonrandom and observed, and $Y$ is random and observed; $B$ is nonrandom and not observed; $E$ is random and not observed, but we know $\mathcal{E}[E] = O$ and the rows of $E$ are independent [...]

[...] of this model is given in chapter 63.

62.2.5 Testing

We will first look at tests of hypotheses of the form $RB = U$. This is a quite specialized hypothesis, meaning that each column of $B$ is subject to the same linear constraint, although the values which these linear combinations take may differ from column to column. Remember in the univariate case we introduced several testing principles, the Wald test, the likelihood ratio test, and the Lagrange multiplier test, and showed that in the linear model they are equivalent. These principles can be directly transferred to the multivariate case. The Wald test consists in computing the unconstrained estimator $\hat B$, and assessing, in terms of the (estimated, i.e., “studentized”) Mahalanobis distance, how far [...]

[...] and it is independent of $\hat B$. Let us look at the simplest example, in which $X = \iota$. Then $B$ is a row vector; write it as $B = \mu^\top$, and the model reads

(62.2.18) $Y = \iota \mu^\top + E,$

in other words, each row $y$ of $Y$ is an independent drawing from the same $(\mu, \Sigma)$ distribution, and we want to estimate $\mu$ and $\Sigma$, and also the correlation coefficients. An elementary and [...]

[...] specific kind of hypothesis, namely, a hypothesis of the form $r^\top B = u^\top$. Now let us turn to the more general hypothesis $RB = U$, where $R$ has rank $i$, and apply the F-test principle. For this one runs the constrained and the unconstrained multivariate regression, calling the attained error sum of squares and products matrices $\hat E_1^\top \hat E_1$ (for the constrained) and $\hat E^\top \hat E$ (for [...]

[...] degrees of freedom, but divides the determinant of the unconstrained error sum of squares and products matrix by the determinant of the total sum of squares and products matrix. This gives a statistic whose distribution is again independent of $\Sigma$. Definition of Wilks’s Lambda: if $W_1 \sim W_r(k_1, \Sigma)$ and $W_2 \sim W_r(k_2, \Sigma)$ are independent (the subscript $r$ indicating that $\Sigma$ is $r \times r$), then

(62.2.22) $|W_1|$ [...]

[...] Here $\Sigma^{-1/2} z \sim N(o, I)$, and from $W = Y^\top Y$, where each row of $Y$ is a $N(o, \Sigma)$, then $\Sigma^{-1/2} S \Sigma^{-1/2} = U^\top U$ where each row of $U$ is a $N(o, I)$. From the interpretation of the Mahalanobis distance as the number of standard deviations the “worst” linear combination is away from its mean, Hotelling’s $T^2$-test can again be interpreted as: make $t$-tests for all possible linear combinations of the components [...] at an appropriately less stringent significance level, and reject the hypothesis if at least one of these $t$-tests rejects. This principle of constructing tests for multivariate hypotheses from those of simple hypotheses is called the “union-intersection principle” in multivariate statistics. Since the usual $F$-statistic in univariate regression can [...]
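To make the test statistic in these fragments concrete, here is a sketch that computes Wilks's $\Lambda$ as the ratio of the determinants of the unconstrained and constrained error sum-of-squares-and-products matrices. The constraint used (dropping some regressors, i.e. $RB = O$ for a selector matrix $R$) and all names are my own choices.

```python
import numpy as np

def error_ssp(Y, X):
    """Error sum of squares and products matrix E-hat^T E-hat."""
    B_hat = np.linalg.lstsq(X, Y, rcond=None)[0]   # columnwise OLS fits every column of Y
    E_hat = Y - X @ B_hat
    return E_hat.T @ E_hat

def wilks_lambda(Y, X, keep):
    """Lambda = |unconstrained ESSP| / |constrained ESSP|; the constrained
    model keeps only the columns of X listed in `keep`."""
    return np.linalg.det(error_ssp(Y, X)) / np.linalg.det(error_ssp(Y, X[:, keep]))
```

Small values of $\Lambda$ speak against the constraint; its null distribution is the $\Lambda$ distribution defined in (62.2.22), which in practice is commonly approximated by a chi-square or an F distribution.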