Limiting Distributions for Static GLSEM

Part of the document Topics in Advanced Econometrics (pages 82–89)

Jointly Normal Errors

We examine the simplest case of a static model with jointly normal errors.

In many ways this is a rather simple problem, since the most difficult aspect of the derivation was bypassed in the transition to Eq. (2.3). Still operating within the framework of assumptions (A.1) through (A.5), and bearing in mind that the matrix X consists only of its exogenous component P, we add the further requirement that, for all t,

$$u_{t\cdot}' \sim N(0, \Sigma).$$

Since the matrix of structural errors is given by $U = (u_{t\cdot})$, $t = 1, 2, \ldots, T$, we have that

$$u = \mathrm{vec}(U)$$

implies

$$\mathrm{Cov}(u) = \Sigma \otimes I_T = \Phi.$$

Consequently, for every $T$,

$$\xi_T = \frac{1}{\sqrt{T}}(I \otimes X)'u \sim N\left(0,\; \Sigma \otimes \frac{X'X}{T}\right),$$

and letting $T \to \infty$ we obtain¹

$$\xi_T \xrightarrow{d} \xi \sim N(0,\; \Sigma \otimes M_{xx}).$$

Since the 2SLS and 3SLS estimators are, asymptotically, simple linear transformations of the random vector $\xi$, we conclude that

$$\sqrt{T}(\hat{\delta} - \delta)_{2SLS} \xrightarrow{d} N(0,\; C_{2SLS}), \tag{2.7}$$

$$\sqrt{T}(\hat{\delta} - \delta)_{3SLS} \xrightarrow{d} N(0,\; C_{3SLS}), \tag{2.8}$$

where $C_{2SLS}$ and $C_{3SLS}$ denote the respective limiting covariance matrices.

Nonnormal Errors

In this section we shall derive the limiting distribution of the estimators, dispensing with the normality assumption, but still maintaining that X contains only its exogenous component. The complication introduced is that we no longer know the distribution of $\xi_T$ and, thus, we need to invoke an appropriate central limit theorem (CLT). Unfortunately, in order to do so in a responsible way, and without excessive "hand waving", we need to introduce more structure into our discussion. The structure required is treated extensively² in Dhrymes (1989). The basic

¹The notation $\xi_T \xrightarrow{d} \xi$, below, is to be read "the random vector $\xi_T$ converges in distribution to the random vector $\xi$"; the notation $\xi \sim N(0, \Theta)$ is to be read in the usual fashion, i.e. "the random vector $\xi$ has the multivariate normal distribution with mean zero and covariance matrix $\Theta$".

²Indeed, the volume Topics in Advanced Econometrics: Probability Foundations, New York: Springer-Verlag, 1989, was written explicitly for the purpose of providing a readily accessible reference to the underlying measure and probability theory required for most problems likely to arise in classical econometrics. The volume presupposes prior training in analysis and a modicum of mathematical maturity. Otherwise, it attempts to be reasonably self-contained.

random elements of the problem are the random vectors $\{u_{t\cdot}' : t \geq 1\}$; the context also includes the exogenous sequence $\{x_{t\cdot}' : t \geq 1\}$ and the parameter space, which consists of the admissible values of the triplet $(B^*, C, \Sigma)$. Thus, let $(\Omega, \mathcal{A}, P)$ be a (sufficiently large) probability space³ such that the sequence $\{u_{t\cdot}' : t \geq 1\}$ may be defined on it and such that it also contains the collection of "events" corresponding to the sequence $\{\xi_T : T \geq 1\}$, i.e. $\xi_T$ is $\mathcal{A}$-measurable, for every $T$. Let

$$\mathcal{A}_t = \sigma(u_{s\cdot}',\; s \leq t), \tag{2.9}$$

i.e. $\mathcal{A}_t$ is the $\sigma$-algebra generated by the first $t$ random vectors. It is clear that $\mathcal{A}_{t-1} \subset \mathcal{A}_t \subset \mathcal{A}$, and that $\xi_{t\cdot}'$ is $\mathcal{A}_t$-measurable. Moreover, if $\mathcal{A}_{tT}$ is the $\sigma$-algebra corresponding to $\xi_{tT\cdot}$, then $\xi_{tT\cdot}'$ is $\mathcal{A}_{tT}$-measurable and $\mathcal{A}_{t-1,T} \subset \mathcal{A}_{tT}$. In this probabilistic framework we need to apply a CLT to the triangular array $\xi_T = \sum_{t=1}^{T}\xi_{tT\cdot}$ as $T \to \infty$. To better understand the structure of the problem confronting us, and using the results in Ch.

4, Sec. 2, of Dhrymes (1984), we can write

$$\xi_T = \frac{1}{\sqrt{T}}(I \otimes X)'u = \frac{1}{\sqrt{T}}\,\mathrm{vec}(X'U) = \frac{1}{\sqrt{T}}\sum_{t=1}^{T}\mathrm{vec}(x_{t\cdot}'u_{t\cdot}) = \frac{1}{\sqrt{T}}\sum_{t=1}^{T}(I \otimes x_{t\cdot}')u_{t\cdot}' = \sum_{t=1}^{T}\xi_{tT\cdot}, \tag{2.10}$$

$$\xi_{tT\cdot} = \frac{1}{\sqrt{T}}\,\xi_{t\cdot}', \qquad \xi_{t\cdot}' = (I \otimes x_{t\cdot}')u_{t\cdot}', \tag{2.11}$$

and we see that, in view of assumption (A.5) in Chapter 1, we require a CLT for independent, not identically distributed random variables. This is so since the summands in Eq. (2.10) obey
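The chain of identities in Eq. (2.10) is purely algebraic and can be checked exactly on a small example (the dimensions below are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
T, k, m = 6, 3, 2                        # small, arbitrary dimensions
X = rng.normal(size=(T, k))
U = rng.normal(size=(T, m))

lhs = (X.T @ U).flatten(order="F")       # vec(X'U), stacking columns
# sum over t of (I (Kronecker) x_t') u_t'
rhs = sum(np.kron(np.eye(m), X[t][:, None]) @ U[t] for t in range(T))
# (I (Kronecker) X)' vec(U)
alt = np.kron(np.eye(m), X).T @ U.flatten(order="F")

print(np.allclose(lhs, rhs), np.allclose(lhs, alt))   # True True
```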

$$E(\xi_{t\cdot}') = 0,$$

and, moreover,

$$\mathrm{Cov}(\xi_{t\cdot}') = \Sigma \otimes x_{t\cdot}'x_{t\cdot}. \tag{2.12}$$

We further note from Proposition 34, Ch. 4 in Dhrymes (1989), that for a sequence of random vectors $\{X_n : n \geq 1\}$,

³In $(\Omega, \mathcal{A}, P)$, $\Omega$ is said to be the sample space, which remains a primitive concept, $\mathcal{A}$ is a $\sigma$-algebra of its subsets, and $P$ is a probability measure defined on $\mathcal{A}$.

$$X_n \xrightarrow{d} X \quad\text{if and only if, for any conformable real vector } \lambda,\quad \lambda'X_n \xrightarrow{d} \lambda'X.$$
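The covariance in Eq. (2.12) is the mixed-product rule for Kronecker products, $(I \otimes x_{t\cdot}')\,\Sigma\,(I \otimes x_{t\cdot}')' = \Sigma \otimes x_{t\cdot}'x_{t\cdot}$, which a quick numerical check confirms (the vector and matrix below are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
k, m = 3, 2
x = rng.normal(size=(k, 1))              # x_t' as a column vector
Sigma = np.array([[1.0, 0.3],
                  [0.3, 1.5]])

A = np.kron(np.eye(m), x)                # (I (Kronecker) x_t'), an mk x m matrix
cov = A @ Sigma @ A.T                    # Cov[(I (Kronecker) x_t') u_t'] when Cov(u_t') = Sigma
print(np.allclose(cov, np.kron(Sigma, x @ x.T)))   # True
```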

To apply this setup to our context, let

$$\zeta_{tT} = \lambda'\xi_{tT\cdot}, \qquad \zeta_T = \lambda'\xi_T = \sum_{t=1}^{T}\zeta_{tT}, \tag{2.13}$$

and note that the random variables $\zeta_{tT}$ constitute, for each $T$, a sequence of independent, not identically distributed, square integrable (i.e. having finite second moments), $\mathcal{A}_{tT}$-measurable random variables obeying

$$E(\zeta_{tT}) = 0, \qquad \mathrm{Var}(\zeta_{tT}) = \sigma_{tT}^2, \tag{2.14}$$

where

$$\sum_{t=1}^{T}\sigma_{tT}^2 = \lambda'\left(\Sigma \otimes \frac{X'X}{T}\right)\lambda \longrightarrow \lambda'(\Sigma \otimes M_{xx})\lambda \geq 0.$$
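The summed-variance identity above rests on linearity of the Kronecker product in its second argument, $\sum_t \Sigma \otimes (x_{t\cdot}'x_{t\cdot}/T) = \Sigma \otimes (X'X/T)$; a numerical sketch (dimensions and $\lambda$ arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
T, k, m = 40, 3, 2
X = rng.normal(size=(T, k))
Sigma = np.array([[1.0, 0.4],
                  [0.4, 1.2]])
lam = rng.normal(size=m * k)             # an arbitrary conformable vector lambda

# sum over t of lambda' (Sigma (Kronecker) x_t' x_t / T) lambda
per_t = sum(lam @ np.kron(Sigma, np.outer(X[t], X[t]) / T) @ lam for t in range(T))
# lambda' (Sigma (Kronecker) X'X/T) lambda
total = lam @ np.kron(Sigma, X.T @ X / T) @ lam
print(np.isclose(per_t, total))          # True
```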

It follows, then, from Proposition 45, Ch. 4, in Dhrymes (1989), that to prove that $\zeta_T$ converges in distribution, it will suffice to show that the sequence above obeys a Lindeberg condition.

Lemma 1. Given the context provided in the informal discussion above, the sequence $\{\zeta_{tT} : t \leq T,\; T \geq 1\}$ obeys a Lindeberg condition, i.e. if, for arbitrary $r > 0$, we define

$$L_T = \sum_{t=1}^{T}\int_{\{|\zeta_{tT}| > \frac{1}{r}\}} \zeta_{tT}^2\, dP, \tag{2.15}$$

then

$$\lim_{T \to \infty} L_T = 0. \tag{2.16}$$

Proof: Note that

$$|\zeta_{tT}|^2 \leq \frac{1}{T}\,|\lambda|^2\,\|I \otimes x_{t\cdot}'\|^2\,|u_{t\cdot}'|^2,$$

where the notation $\|A\|$ indicates the norm of a matrix, defined as the square root of $\mathrm{tr}(AA')$, so that $\|I \otimes x_{t\cdot}'\|^2 = m(x_{t\cdot}x_{t\cdot}')$. Define

$$A_{tT1} = \left\{\omega : |\zeta_{tT}| > \frac{1}{r}\right\}, \qquad A_{tT2} = \left\{\omega : |u_{t\cdot}'| > \frac{\sqrt{T}}{r\,|\lambda|\,[m(x_{t\cdot}x_{t\cdot}')]^{1/2}}\right\}, \tag{2.17}$$

and observe that $A_{tT1} \subset A_{tT2}$, owing to the fact that, on $A_{tT1}$,

$$\frac{1}{r} < |\zeta_{tT}| \leq \frac{1}{\sqrt{T}}\,|\lambda|\,[m(x_{t\cdot}x_{t\cdot}')]^{1/2}\,|u_{t\cdot}'|.$$

Consequently, the integral in Eq. (2.15) may be evaluated as

$$L_T = \sum_{t=1}^{T}\int_{A_{tT1}} \zeta_{tT}^2\, dP \leq \frac{m\,|\lambda|^2}{T}\sum_{t=1}^{T}(x_{t\cdot}x_{t\cdot}')\int_{A_{tT2}} |u_{t\cdot}'|^2\, dP.$$

Even though we have simplified considerably the representation of the integral of the Lindeberg condition, we are still not able to show that it vanishes asymptotically. Further simplifications are required. We note, however, that we have not used the assumption in (A.5), i.e. the i.i.d. assumption regarding the basic structural errors, nor have we used assumption (A.1) or (A.1a), viz. that the limit of $(X'X/T)$ is well behaved.⁴ The fact that the structural errors are i.i.d. means that we can remove the subscript $t$ from the integrand; the fact that $(X'X/T) \to M_{xx}$ has implications that are derived in the appendix to this chapter. In particular, it implies that if we put

$$a_T^2 = \frac{\max_{t \leq T}\, x_{t\cdot}x_{t\cdot}'}{T}, \qquad \text{then} \quad a_T \to 0,$$

in the same mode of convergence as the matrix above.
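The condition $a_T \to 0$ is easy to visualize: when the rows of $X$ are well behaved, the largest squared row norm grows much more slowly than $T$. A hypothetical check with standard normal rows:

```python
import numpy as np

rng = np.random.default_rng(4)
k = 3
a = []
for T in [100, 1000, 10000, 100000]:
    X = rng.normal(size=(T, k))          # hypothetical rows with finite second moments
    norms2 = (X ** 2).sum(axis=1)        # x_t x_t' for each t
    a.append(norms2.max() / T)           # a_T^2 = max_{t<=T} x_t x_t' / T
print(a)                                 # shrinking toward 0 as T grows
```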

Now, define

$$A_T = \left\{\omega : |u_{t\cdot}'| > \frac{1}{r\,|\lambda|\,m^{1/2}\,a_T}\right\},$$

and note that $A_{tT2} \subset A_T$. It follows, therefore, that

$$L_T \leq m\,|\lambda|^2\,\mathrm{tr}\!\left(\frac{X'X}{T}\right)\int_{A_T} |u_{1\cdot}'|^2\, dP. \tag{2.18}$$

The conclusion follows immediately, if we note that as $T \to \infty$ the integral above converges to zero, owing to the finiteness of the second moments

⁴It would appear that, in order to prove the consistency and asymptotic normality of the 2SLS and 3SLS estimators, the minimum set of conditions we can place on the exogenous variables is that $d_T^2 \to \infty$ and

$$\lim_{T \to \infty}\, \sup_{t \leq T}\, \frac{x_{t\cdot}x_{t\cdot}'}{d_T^2} = 0, \qquad \text{where } d_T^2 = \mathrm{tr}\, X'X.$$

In such a case we would not have dealt, for example, with $\sqrt{T}(\hat{\delta} - \delta)_{2SLS}$, but rather with $d_T(\hat{\delta} - \delta)_{2SLS}$, and $\xi_{t\cdot}'$ would not have been divided by $\sqrt{T}$, but rather by $d_T$. It has been traditional in this literature, however, to assume that the second moments of the explanatory variables are well defined both in finite samples and in the limit, and this fact accounts for the normalization by $\sqrt{T}$.

of the structural errors and the fact that for every $\lambda$ and $m$, the trace [in the right member of Eq. (2.18)] converges to $m\,\mathrm{tr}\,M_{xx} < \infty$.

q.e.d.

Corollary 1. Under the conditions of Lemma 1,

$$\xi_T \xrightarrow{d} \xi \sim N(0,\; \Sigma \otimes M_{xx}).$$

Proof: By Proposition 45, Ch. 4, in Dhrymes (1989),

$$\zeta_T \xrightarrow{d} \zeta \sim N\bigl(0,\; \lambda'(\Sigma \otimes M_{xx})\lambda\bigr).$$

Since $\zeta_T = \lambda'\xi_T$, it further follows by Proposition 34, in the same chapter above, that

$$\xi_T \xrightarrow{d} \xi \sim N(0,\; \Sigma \otimes M_{xx}).$$

q.e.d.
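The corollary can be illustrated numerically: with i.i.d. nonnormal errors (below, centered exponentials, a hypothetical choice), the replicated values of $\xi_T$ still exhibit the covariance $\Sigma \otimes (X'X/T)$, and a linear functional $\lambda'\xi_T$ is approximately normal:

```python
import numpy as np

rng = np.random.default_rng(6)
T, k, m = 200, 2, 2
X = rng.normal(size=(T, k))                  # fixed, hypothetical exogenous data
Sigma = np.array([[1.0, 0.5],
                  [0.5, 2.0]])
L = np.linalg.cholesky(Sigma)

R = 20000
xis = np.empty((R, m * k))
for r in range(R):
    E = rng.exponential(size=(T, m)) - 1.0   # i.i.d. nonnormal rows, mean 0, unit variance
    U = E @ L.T                              # Cov(u_t) = Sigma, but u_t is NOT normal
    xis[r] = (X.T @ U).flatten(order="F") / np.sqrt(T)

# covariance still matches Sigma (Kronecker) X'X/T ...
cov_gap = np.abs(np.cov(xis, rowvar=False) - np.kron(Sigma, X.T @ X / T)).max()

# ... and lambda' xi_T is close to normal: its kurtosis is near 3
z = xis @ np.ones(m * k)
z = z / z.std()
kurt = (z ** 4).mean()
print(cov_gap, kurt)
```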

A Digression

If the reader is comfortable with the integrations performed above, it is quite unnecessary to read this section. Its purpose is to show the relationship between the Lebesgue integrals above, in the context of taking expectations with respect to a probability measure, and the Riemann or Riemann-Stieltjes integrals performed with respect to densities or cumulative distribution functions.

In the context of the discussion in the preceding sections, we looked at random variables (r.v.) as real valued measurable functions, say $X$, such that

$$X : \Omega \to R,$$

and their expectation was defined by

$$E(X) = \int_{\Omega} X(\omega)\, dP(\omega) \quad\text{or}\quad \int_{\Omega} X(\omega)\, P(d\omega).$$

Roughly speaking, what we do in such integrals is to look at the range of the function, i.e. the values assumed by the function and, choosing a partition thereof, i.e. choosing a set of disjoint intervals covering the range, we find their inverse images. These are sets in the domain of the function that give rise to values assumed by the function in the chosen (range) intervals. We then take a representative value in each interval, multiply by the measure of its inverse image and sum. We do this for all possible partitions. The value of the integral is defined to be the unique value of this sum, if it exists, as the number of intervals in the partition tends to infinity, over all possible partition schemes.
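The range-partition recipe just described can be mimicked numerically: approximate the measure of each inverse image by an empirical frequency, weight a representative value of each range interval by it, and sum. (The exponential distribution below is an arbitrary illustration.)

```python
import numpy as np

rng = np.random.default_rng(9)
samples = rng.exponential(size=400_000)    # X(omega) evaluated on simulated outcomes omega

# Partition the RANGE of X into small intervals; weight each interval's
# representative value by the (empirical) measure of its inverse image, and sum.
edges = np.linspace(0.0, 30.0, 3001)
counts, _ = np.histogram(samples, bins=edges)
reps = (edges[:-1] + edges[1:]) / 2        # representative value for each interval
lebesgue_sum = (reps * counts / samples.size).sum()

print(lebesgue_sum, samples.mean())        # both approximately E(X) = 1
```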

The expected value of an r.v. above was represented as a Lebesgue integral; the reader, however, is more apt to be familiar with expectation as

$$E(X) = \int_{-\infty}^{\infty} \xi\, dF(\xi), \quad\text{or}\quad \int_{-\infty}^{\infty} \xi\, f(\xi)\, d\xi,$$

which we shall consider a single class, the class of Riemann-Stieltjes integrals. In the first representation the random variable is not assumed to have a density function, only a cumulative distribution function (cdf), $F$, and the integral is taken to be the Riemann-Stieltjes integral; in the second representation, $F$ is assumed to be differentiable with density function $f$, so that "$dF = f\,dx$", and the integral is taken to be the ordinary Riemann integral of elementary calculus. Since we have given two basic representations of the same entity, $E(X)$, the Lebesgue and Riemann-Stieltjes versions of the integral "ought" to give us the same value. Without taking up the fine subtleties of the two procedures we give below the relationship between them in the case where both integrals make sense.⁵

Given an r.v., say $X$, defined on the probability space $(\Omega, \mathcal{A}, P)$, with values in the Borel space $(R, \mathcal{B})$, it (the r.v.) induces a probability measure, say $P_X$,⁶ on the Borel space, such that for every $B \in \mathcal{B}$,

$$P_X(B) = P(A),$$

where $A = X^{-1}(B)$, i.e. $A$ is the inverse image of $B$ under $X$. Since $X(\Omega) = R$, and $X$ induces the probability measure $P_X$ on the Borel space $(R, \mathcal{B})$, if we make the change in variable $\xi = X(\omega)$, we find

$$\int_{\Omega} X(\omega)\, dP = \int_{R} \xi\, dP_X.$$

To round out our discussion, we need only show the connection between the probability distribution (measure), $P_X$, and the cdf, say $F_X$. Again, without going into the mathematical details of the matter, define for the special sets $B = (-\infty, x]$, for $x \in R$, the point function

$$F_X(x) = P_X(B).$$
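For a concrete (hypothetical) illustration, take $X$ standard normal: the simulated measure $P_X((-\infty, x])$ agrees with the familiar cdf $F_X(x)$:

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(7)
xs = rng.normal(size=200_000)                 # draws from the induced measure P_X
x = 0.7
p_emp = (xs <= x).mean()                      # P_X((-inf, x]) estimated by simulation
F = 0.5 * (1.0 + erf(x / sqrt(2.0)))          # standard normal cdf evaluated at x
print(abs(p_emp - F))                         # small
```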

⁵Since mathematicians do not go out of their way to invent redundancies, the reader would surmise that sometimes the Riemann and/or Riemann-Stieltjes integral fails to exist, where in some fundamental sense it "ought" to exist. In fact, this was the motivation for the Lebesgue integral, which coincides with the Riemann-Stieltjes integral when both exist and, moreover, is defined in situations where the latter is not.

⁶This probability measure is also termed the probability distribution of the r.v., and is to be distinguished from the cdf of the r.v. The connection between the two will be shown below.

Since the class of sets $\{B : B = (-\infty, x],\; x \in R\}$ can be shown to generate the (Borel) $\sigma$-algebra $\mathcal{B}$, we can write

$$\int_{R} \xi\, dP_X = \int_{-\infty}^{\infty} \xi\, dF_X(\xi).$$

Moreover, if the original probability measure, $P$, has a density, i.e. if it is absolutely continuous with respect to a $\sigma$-finite measure, say $\mu$, then by the Radon-Nikodym theorem, Proposition 26, Ch. 1 in Dhrymes (1989), there exists a nonnegative measurable function, say $\phi$, such that for every set $A \in \mathcal{A}$,

$$P(A) = \int_{A} \phi\, d\mu.$$

To relate this to our previous discussion, we note that the change in variable $\xi = X(\omega)$, together with the relation $A = X^{-1}(B)$, implies, for the special sets $B = (-\infty, x]$, $x \in R$,

$$P(A) = P_X(B) = \int_{-\infty}^{x} f_X(\xi)\, d\xi = F_X(x),$$

where $f_X(\xi) = \phi[X^{-1}(\xi)]$. Consequently, in the case of absolutely continuous measures, we have the relation

$$\int_{\Omega} X\, dP = \int_{-\infty}^{\infty} \xi\, f_X(\xi)\, d\xi,$$

where $f_X$ is the density function of the r.v. $X$; thus, the expectation of the latter is rendered in the familiar form, which is shown to be equivalent to the form required in the context of the probability space $(\Omega, \mathcal{A}, P)$.
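As a final (hypothetical) illustration of the equivalence just derived, take $X \sim \mathrm{Exp}(1)$: the Lebesgue integral $\int_{\Omega} X\, dP$, approximated by simulation, and the Riemann integral $\int \xi f_X(\xi)\, d\xi$, approximated by quadrature, agree:

```python
import numpy as np

rng = np.random.default_rng(8)

# Lebesgue side: E(X) over (Omega, A, P), approximated by simulating X ~ Exp(1)
samples = rng.exponential(size=500_000)
e_lebesgue = samples.mean()

# Riemann side: integral of xi * f_X(xi) with density f_X(xi) = exp(-xi)
xi = np.linspace(0.0, 40.0, 400_001)
g = xi * np.exp(-xi)
e_riemann = float(np.sum((xi[1:] - xi[:-1]) * (g[1:] + g[:-1]) / 2.0))  # trapezoid rule

print(e_lebesgue, e_riemann)               # both close to E(X) = 1
```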
