Limiting Distributions for Dynamic GLSEM

Generalities

If the GLSEM we are dealing with is dynamic, i.e. if it contains lagged dependent variables, then whether the structural errors are jointly normal or not, we face a very serious complication. To appreciate the nature of the difficulty refer to Eqs. (2.10) and (2.11) of the preceding discussion. If the elements of $x_{t\cdot}$ are all exogenous then the summands of Eq. (2.10), i.e. the vectors $\xi_{t\cdot}' = (u_{t\cdot}' \otimes x_{t\cdot}')$, constitute a sequence of independent, but not identically distributed, random elements. If, on the other hand,

$$x_{t\cdot} = (p_{t\cdot}, y_{t-1\cdot}, y_{t-2\cdot}, \ldots, y_{t-k\cdot})$$

so that the model is dynamic, then the vectors $\xi_{t\cdot}$, above, constitute a sequence of dependent random elements. Consequently, in order to handle the limiting distribution problem in this case we need to invoke a CLT for dependent random variables and, moreover, we need to obtain the final form of the model. The latter is the task to which we now turn.

Lag Operators

The discussion here will, in many respects, duplicate the material presented in Ch. 5, Sec. 2, of Dhrymes (1984) under the title "Vector Difference Equations with Constant Coefficients". The reader desiring greater detail is referred to that source. In the current discussion this brief review is intended, mainly, to establish notation. We recall that a lag operator, denoted here by $L$, is defined by the property

$$L x_t = x_{t-1}. \tag{2.19}$$

Powers of the operator denote repeated application of the operator above, i.e.

$$L^k x_t = x_{t-k}, \quad k = 1, 2, \ldots \tag{2.20}$$

The identity operator, $I$, and the zero operator, $0$, are defined, respectively, by the operations below, for all functions $x_t$,

$$I x_t = x_t, \qquad 0 x_t = 0. \tag{2.21}$$

Generally these last two operators will not be written out explicitly, unless the context requires it for clarity. Finally note that by convention we define the zeroth power of L to be the identity operator, i.e.

$$L^0 = I, \quad \text{or, alternatively,} \quad L^0 x_t = x_t.$$

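The lag operator is, in addition, linear in the sense that, for scalars $a$ and $b$,

$$L(a x_t + b y_t) = a L x_t + b L y_t = a x_{t-1} + b y_{t-1},$$

and obeys the law of exponents, i.e.

$$L^i L^j x_t = L^{i+j} x_t = x_{t-i-j}. \tag{2.22}$$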

Thus, polynomials in the operator $L$, with real or complex coefficients, for example $a_0 I + a_1 L + a_2 L^2$, have a perfectly straightforward meaning. In fact, it may be shown that the algebra of polynomials of degree $n \ge 0$ in the lag operator $L$ over the field of real or complex numbers (i.e. with real or complex coefficients) is isomorphic to the algebra of polynomials in the real or complex indeterminate, say $t$. What this means is that if an operation is desired to be performed on polynomials in the lag operator, we may first replace $L$ and its powers by the real indeterminate, say $t$, perform the desired operation and then replace $t$ and its powers by the operator $L$ and its powers. The resulting entity is the result of the (originally desired) operation involving the two polynomials in the lag operator. A few examples may help clarify matters.
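As a numerical illustration of the isomorphism, the sketch below applies two lag polynomials one after the other and compares the outcome with a single application of their ordinary polynomial product. The helper `apply_lag_poly`, the coefficient values and the convention that the sequence is zero before the start of the sample are all assumptions made for this illustration only.

```python
import numpy as np

def apply_lag_poly(coeffs, x):
    """Apply a_0 I + a_1 L + ... + a_n L^n to the sequence x.

    Values of x before the start of the sample are taken to be zero
    (an assumption made only for this illustration).
    """
    x = np.asarray(x, dtype=float)
    out = np.zeros_like(x)
    for j, a in enumerate(coeffs):
        if j < len(x):
            # L^j x_t = x_{t-j}: shift the sequence j periods.
            out[j:] += a * x[: len(x) - j]
    return out

# Two illustrative polynomials (coefficients ordered a_0, a_1, ...).
a = np.array([1.0, -0.5, 0.25])        # I - 0.5 L + 0.25 L^2
b = np.array([2.0, 0.0, 1.0, -1.0])    # 2 I + L^2 - L^3

x = np.arange(1.0, 11.0)               # x_t = 1, 2, ..., 10

# Route 1: apply the two operators one after the other.
sequential = apply_lag_poly(a, apply_lag_poly(b, x))

# Route 2: multiply the polynomials in an ordinary indeterminate,
# then read the product back as a polynomial in L.
product = np.convolve(a, b)
combined = apply_lag_poly(product, x)

print(np.allclose(sequential, combined))
```

Both routes give the same sequence, which is what the isomorphism asserts for the operation of multiplication.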

Example 1. Consider the polynomials

$$A_i(L) = \sum_{j=1}^{n_i} a_{ji} L^j, \quad i = 1, 2, \ldots, m.$$

Evidently,

$$A_i(L) x_t = \sum_{j=1}^{n_i} a_{ji} x_{t-j}.$$

The sum of any two polynomials, say $A_s(L) + A_k(L)$, on the assertion that $n_s \le n_k$, is given by

$$A_{sk}(L) = \sum_{j=1}^{n_s} (a_{js} + a_{jk}) L^j + \sum_{j=n_s+1}^{n_k} a_{jk} L^j.$$

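To illustrate multiplication, let $n_1 = 2$, $n_2 = 3$; then

$$A_1(L)A_2(L) = a_{11}a_{12}L^2 + (a_{11}a_{22} + a_{21}a_{12})L^3 + (a_{11}a_{32} + a_{21}a_{22})L^4 + a_{21}a_{32}L^5.$$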

Example 2. Suppose

$$A(L) = I - \lambda L.$$

What meaning is to be ascribed to the entity $(I - \lambda L)^{-1}$, by which we wish to denote the operator inverse to $A(L) = (I - \lambda L)$, i.e. the operator obeying

$$A(L)(I - \lambda L)^{-1} = I?$$

By the discussion just above we may write, formally,

$$(I - \lambda L)^{-1} = \sum_{j=0}^{\infty} \lambda^j L^j,$$

and the question remains as to how we are to interpret the right member above, and whether there are any restrictions on the circumstances that guarantee the validity of this interpretation. To explore the issues that underlie this question put

$$y_t = (I - \lambda L)^{-1} x_t \tag{2.23}$$

and apply the operator $(I - \lambda L)$ to both sides to obtain

$$y_t = \lambda y_{t-1} + x_t, \tag{2.24}$$

which is simply a first order difference equation with forcing function $x_t$. Now suppose the $x$-sequence obeys

$$x_t = 0 \ \text{for } t < 0, \qquad x_t = \bar{x} \ \text{for } t \ge 0.$$

In Eq. (2.24), solving the difference equation with this particular forcing function yields

$$y_t = \lambda^t y_0 + \bar{x} \sum_{j=0}^{t-1} \lambda^j,$$

where $y_0$ indicates the "initial" condition, i.e. the value assumed by the process, $y$, at "time zero"; in this particular case it so happens that $y_0 = \bar{x}$, but this need not be the case at all times. Now, if $|\lambda| < 1$ the role of initial conditions declines in significance as we move away from "time zero". On the other hand if $|\lambda| \ge 1$ then the significance of initial conditions either remains undisturbed or is actually magnified as we move away from "time zero". An extreme illustration of the consequences of such parametric specification is afforded by the following $x$-sequence:

$$x_t = 0 \ \text{for } t < 0, \qquad x_t = \bar{x} \ \text{for } t = 0, \qquad x_t = 0 \ \text{for } t > 0,$$

which yields the solution,

$$y_t = \lambda^t y_0.$$

If $|\lambda| \ge 1$, we see that even though the $x$-process departed from its null state only momentarily, it has induced on the $y$-process a permanent change in behavior; in fact if $|\lambda| > 1$ the temporary departure in the $x$-process has changed the $y$-process from a null state to one that exhibits its logarithm as a pure time trend, when $\lambda$ is real! In this volume, we rule out such situations and we shall therefore insist that $|\lambda| < 1$, which means that when we consider the characteristic polynomial $(1 - \lambda z) = 0$ (with complex $z$), we shall consider admissible only parametric specifications of $\lambda$ that permit convergence for $z$ on the unit circle, i.e. for $|z| = 1$; specifically, we require $|\lambda| < 1$. With this interpretation we can now write

$$(I - \lambda L)^{-1} = \sum_{j=0}^{\infty} \lambda^j L^j$$

and the meaning of $(I - \lambda L)^{-1} x_t$ is quite unambiguous.
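The role of the restriction on the root can be seen numerically. The sketch below iterates the first order difference equation for a forcing function that departs from zero only at "time zero", once with a root inside and once with a root outside the unit circle, and checks that in the stable case the truncated series expansion of the inverse operator reproduces the recursion. The function names, the two parameter values (0.8 and 1.1) and the zero presample values are illustrative assumptions.

```python
import numpy as np

def solve_difference_equation(lam, x, y_init=0.0):
    """Iterate y_t = lam * y_{t-1} + x_t, starting from y_{-1} = y_init."""
    y = np.empty(len(x))
    prev = y_init
    for t, x_t in enumerate(x):
        prev = lam * prev + x_t
        y[t] = prev
    return y

def truncated_inverse(lam, x, n_terms):
    """Apply sum_{j=0}^{n_terms-1} lam^j L^j to x, with x_t = 0 before the sample."""
    y = np.zeros(len(x))
    for j in range(min(n_terms, len(x))):
        y[j:] += lam**j * x[: len(x) - j]
    return y

T = 60
impulse = np.zeros(T)
impulse[0] = 1.0   # x departs from its null state only at "time zero"

stable = solve_difference_equation(0.8, impulse)      # root inside the unit circle
explosive = solve_difference_equation(1.1, impulse)   # root outside the unit circle

# With a root inside the unit circle the series expansion matches the recursion.
series = truncated_inverse(0.8, impulse, n_terms=T)
print(np.allclose(stable, series))

# The effect of the momentary departure dies out in one case and grows in the other.
print(abs(stable[-1]) < 1e-5)
print(abs(explosive[-1]) > 100)
```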

Before we leave this topic it should be pointed out that the operator framework we have just established is quite well suited to producing formal solutions for difference equations. For example, let $y_{t\cdot}$ and $x_{t\cdot}$ be $m$- and $n$-element vectors respectively and suppose they obey

$$y_{t\cdot} = y_{t-1\cdot}A' + x_{t\cdot}B'.$$

A formal solution is given, immediately, as

$$y_{t\cdot}' = (I - AL)^{-1} B x_{t\cdot}', \tag{2.25}$$


provided the matrix $A$ is stable, i.e. it has roots less than unity in absolute value. If so, then we have the explicit representation

$$(I - AL)^{-1} = \sum_{j=0}^{\infty} A^j L^j, \qquad A^j = P \Lambda^j P^{-1},$$

where $P$ is the matrix$^7$ of characteristic vectors and $\Lambda$ is the (diagonal) matrix of the characteristic roots of the matrix $A$. The particular solution of Eq. (2.26) may then be represented operationally as

$$y_{t\cdot}' = \sum_{j=0}^{\infty} P \Lambda^j P^{-1} B x_{t-j\cdot}'.$$

The Final Form of a Dynamic GLSEM

When the model is dynamic, the vector $x_{t\cdot}$ is the vector of predetermined variables and, as such, it contains both lagged dependent and exogenous variables, i.e. we may write, in a slight departure from our earlier custom, $x_{t\cdot} = (p_{t\cdot}, y_{t-1\cdot}, y_{t-2\cdot}, \ldots, y_{t-k\cdot})$. This implies that the maximal lag contained in the structural model is $k$ and that the exogenous variables of the model are contained in the $s$-element vector, $p_{t\cdot}$. To ensure maximal compatibility of notation, relative to the static case, partition the matrix $C$, in the structural representation $y_{t\cdot}B^* = x_{t\cdot}C + u_{t\cdot}$, as

$$C' = (C_0', C_1', \ldots, C_k') \tag{2.26}$$

so that we may write the structural model as

$$y_{t\cdot}B^* = p_{t\cdot}C_0 + y_{t-1\cdot}C_1 + y_{t-2\cdot}C_2 + \cdots + y_{t-k\cdot}C_k + u_{t\cdot}. \tag{2.27}$$

Using the operator notation developed above, and writing the system in column vector form, we find

$$B^{*\prime} y_{t\cdot}' = \sum_{j=1}^{k} C_j' L^j y_{t\cdot}' + C_0' p_{t\cdot}' + u_{t\cdot}'.$$

Multiplying through by the transpose of $B^{*-1} = D$, i.e. obtaining the reduced form, we have

$$\Pi(L) y_{t\cdot}' = \Pi_0' p_{t\cdot}' + v_{t\cdot}',$$

$$\Pi(L) = I - \sum_{j=1}^{k} \Pi_j' L^j, \qquad \Pi_j' = D' C_j', \quad j = 0, 1, 2, \ldots, k, \qquad v_{t\cdot} = u_{t\cdot} D.$$

7 The matrix $P$ will be nonsingular if $A$ has distinct roots, which is assumed for the purposes of this discussion.

Assuming stability, and abstracting from initial conditions, the final form of the model may be found by inverting the operator $\Pi(L)$, which is simply a matrix whose typical element is a polynomial of degree at most $k$ in the lag operator $L$. In view of the isomorphism alluded to above, we may find the inverse of the operator $\Pi(L)$ constructively, i.e. by obtaining the adjoint of the matrix $\Pi(L)$ and dividing each element by the determinant $b(L) = |\Pi(L)|$. Doing so yields

$$y_{t\cdot}' = \frac{H(L)}{b(L)} p_{t\cdot}' + \frac{G(L)}{b(L)} u_{t\cdot}', \tag{2.28}$$

where

$$A(L) = \text{adjoint of } \Pi(L), \qquad b(L) = |\Pi(L)|, \qquad H(L) = A(L)\Pi_0', \qquad G(L) = A(L)D'. \tag{2.29}$$

It only remains to give meaning to the operator (I/b(L)). But this is rather simple in terms of our discussion in Example 2. First, consider the polynomial equation

$$b^*(z) = z^n + b_1 z^{n-1} + b_2 z^{n-2} + \cdots + b_n = 0, \qquad n = mk, \tag{2.30}$$

in the complex indeterminate, $z$. The order of the polynomial is $n = mk$, which is verified by noting that each element of $\Pi(L)$ is a polynomial of degree at most $k$, and the determinant consists of the sum of all possible products formed by taking one term from each row (or column). Since $\Pi(L)$ is an $m \times m$ matrix the conclusion follows immediately. Now, let $z_i$, $i = 1, 2, \ldots, n$, be the roots of the equation above. By the fundamental theorem of algebra we can write

$$b^*(z) = \prod_{j=1}^{n} (z - z_j).$$

The characteristic function$^8$ of the difference equation describing the dynamic GLSEM, however, is given not by Eq. (2.30) but by

$$b(z) = 1 + b_1 z + b_2 z^2 + \cdots + b_n z^n. \tag{2.31}$$

It is a further remarkable result from algebra that the roots of Eq. (2.31) are simply the inverse of the roots of Eq. (2.30), i.e. that the roots of the equation above are given by

$$\frac{1}{z_j}, \quad j = 1, 2, \ldots, n.$$

8 When one states that an equation or a system is stable one generally means that the roots of the characteristic function or the characteristic equation of the system are less than unity in absolute value.

Putting $\lambda_j = z_j$, so that $b(z) = z^n b^*(1/z)$ factors term by term, we may therefore write

$$b(z) = \prod_{j=1}^{n} (1 - \lambda_j z),$$

and, using the isomorphism alluded to above, conclude that

$$b(L) = \prod_{j=1}^{n} (I - \lambda_j L), \quad \text{and therefore} \quad \frac{I}{b(L)} = \prod_{j=1}^{n} (I - \lambda_j L)^{-1}.$$
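The relation between the two polynomials, and the factorization of the inverse operator, can be checked numerically. The sketch below uses hypothetical coefficients $b_1, b_2, b_3$ chosen so that Eq. (2.30) has the roots 0.1, 0.4 and 0.6; it takes the $\lambda_j$ of the factorization to be the roots of Eq. (2.30), so that the roots of Eq. (2.31) are their reciprocals, and compares the weights of $I/b(L)$ obtained by recursion with those obtained by multiplying the individual geometric series. The truncation length of 25 terms is an arbitrary choice.

```python
import numpy as np

# Hypothetical coefficients b_1, b_2, b_3 (n = 3), chosen for illustration.
b = np.array([-1.1, 0.34, -0.024])
n = len(b)

# np.roots expects the coefficient of the highest power first.
roots_star = np.roots(np.concatenate(([1.0], b)))       # roots of b*(z), Eq. (2.30)
roots_b = np.roots(np.concatenate((b[::-1], [1.0])))    # roots of b(z), Eq. (2.31)

# The roots of b(z) are the reciprocals of the roots of b*(z).
reciprocal = np.allclose(np.sort_complex(roots_b),
                         np.sort_complex(1.0 / roots_star))

# Coefficients of prod_j (1 - lambda_j z) in ascending powers of z.
factored = np.array([1.0])
for lam in roots_star:
    factored = np.convolve(factored, [1.0, -lam])
same_poly = np.allclose(factored, np.concatenate(([1.0], b)))

# Weights of I / b(L), first by recursion from b(L) psi(L) = I ...
N = 25
psi = np.zeros(N)
psi[0] = 1.0
for i in range(1, N):
    psi[i] = -sum(b[j - 1] * psi[i - j] for j in range(1, min(i, n) + 1))

# ... and then as the product of the geometric series for each (I - lambda_j L)^{-1}.
series = np.array([1.0])
for lam in roots_star:
    series = np.convolve(series, lam ** np.arange(N))[:N]
same_weights = np.allclose(series, psi)

print(reciprocal, same_poly, same_weights)
print(np.all(np.abs(roots_star) < 1))   # stability: all lambda_j inside the unit circle
```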

The meaning of the last representation, however, had been made quite clear in Example 2. Thus, provided the model is stable, there is no ambiguity as to the meaning of the representation of the solution given in Eq. (2.28), which is also known as the final form of the GLSEM, without initial conditions. Putting, now, in the obvious notation,

$$y_{t\cdot} = \bar{y}_{t\cdot} + v_{t\cdot}^*, \quad t \ge 1, \tag{2.32}$$

we have that the first component, $\bar{y}_{t\cdot}$, depends only on the exogenous variables and their lags, and as such is clearly independent of the error component $v_{t\cdot}^*$. We also note that the matrix $X$ may be represented as

$$X = (P, \bar{Y}_{-1}, \bar{Y}_{-2}, \ldots, \bar{Y}_{-k}) + (0, V^*), \tag{2.33}$$

where

$$V^* = (V_{-1}^*, V_{-2}^*, \ldots, V_{-k}^*). \tag{2.34}$$

We note, for future reference, that

(2.35)

The entity $G(I)/b(I)$ is well defined and finite, since unity is not one of the roots of the characteristic equation of the system. It is, further, simple to demonstrate that

$$\mathrm{Cov}(v_{t\cdot}^{*\prime}) = \sum_{\tau=0}^{\infty} G_\tau \Sigma G_\tau' = H(0), \tag{2.36}$$

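and, moreover, that $H(i - j) = H'(j - i)$. Thus, we may write, with $V_{t\cdot}^*$ denoting the $t$th row of $V^*$,

$$\mathrm{Cov}(V_{t\cdot}^{*\prime}) = H = \begin{bmatrix} H(0) & H'(1) & \cdots & H'(k-1) \\ H(1) & H(0) & \cdots & H'(k-2) \\ \vdots & \vdots & & \vdots \\ H(k-2) & H(k-3) & \cdots & H'(1) \\ H(k-1) & H(k-2) & \cdots & H(0) \end{bmatrix}. \tag{2.37}$$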
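The structure of the matrix $H$ may be illustrated in a special case. The sketch below assumes a hypothetical stable first order system $v_t = A v_{t-1} + e_t$ with $\mathrm{Cov}(e_t) = \Omega$, written for column vectors, so that the weights are simply $G_\tau = A^\tau$; the matrices $A$ and $\Omega$, the number of stacked lags and the convention $H(s) = E(v_{t-s} v_t')$ are assumptions made for the illustration and are not the general GLSEM quantities of the text.

```python
import numpy as np

# Hypothetical stable first-order system v_t = A v_{t-1} + e_t, Cov(e_t) = Omega,
# so that G_tau = A^tau (an illustrative special case only).
A = np.array([[0.5, 0.1],
              [0.2, 0.3]])
Omega = np.array([[1.0, 0.3],
                  [0.3, 2.0]])
k = 3              # number of lags stacked in the block matrix
m = A.shape[0]

# H(0) = sum_{tau >= 0} G_tau Omega G_tau', truncated once the terms are negligible.
H0 = np.zeros((m, m))
G = np.eye(m)
for _ in range(500):
    H0 += G @ Omega @ G.T
    G = A @ G

def H_lag(s):
    """H(s) = E[v_{t-s} v_t'] = H(0) (A')^s for s >= 0 in this special case."""
    return H0 @ np.linalg.matrix_power(A.T, s)

# Block (i, j) is H(i - j) for i >= j and H'(j - i) for i < j.
H = np.block([[H_lag(i - j) if i >= j else H_lag(j - i).T
               for j in range(k)] for i in range(k)])

print(np.allclose(H0, A @ H0 @ A.T + Omega))   # H(0) satisfies H(0) = A H(0) A' + Omega
print(np.allclose(H, H.T))                     # H is symmetric
print(np.linalg.eigvalsh(H).min() > 0)         # and positive definite
```

In this special case the block Toeplitz matrix is the covariance matrix of the stacked vector $(v_{t-1}', v_{t-2}', \ldots, v_{t-k}')'$, which is why it is symmetric and positive definite.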

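In the dynamic case, we must also derive the limit of $X'X/T$, which is a somewhat laborious exercise. First we note that, by the preceding discussion,

and, thus, we would expect that

$$\operatorname*{plim}_{T \to \infty} \frac{X'X}{T} = \operatorname*{plim}_{T \to \infty} \frac{\bar{X}'\bar{X}}{T} + H^*,$$

where

$$H^* = \begin{bmatrix} 0 & 0 \\ 0 & H \end{bmatrix}.$$

This is so since

and, for every $i \ge 0$, we would expect

(2.38)

$$\cdots \to 0, \qquad \cdots \overset{P}{\to} H. \tag{2.39}$$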

The proof of these conjectures is quite cumbersome and is, thus, relegated to the appendix of this chapter.

Limiting Distributions for the Dynamic Model

In this section we provide the details of establishing the limiting distribution when the model is dynamic. Returning to the context of Eqs. (2.10) and (2.11) of this chapter, we need to find the limiting distribution of

An alternative notation for $\xi_{t\cdot}'$ is

$$\xi_{t\cdot}' = (u_{t\cdot}' \otimes x_{t\cdot}').$$

The important difference between the nature of the problem in the static and dynamic cases is that in the former, the individual summands, i.e. the vectors $\xi_{t\cdot}$, constitute a sequence of independent, though not identically distributed, random vectors, while in the latter (dynamic) case they form a sequence of dependent vectors. To see this note that from the representation implicit in Eqs. (2.32) and (2.33) we have

$$x_{t\cdot} = \bar{x}_{t\cdot} + (0, v_{t-1\cdot}^*, v_{t-2\cdot}^*, \ldots, v_{t-k\cdot}^*) = \bar{x}_{t\cdot} + w_{t\cdot}^*, \tag{2.40}$$

so that, for example, $\xi_{t\cdot}$ and $\xi_{t+1\cdot}$ have $v_{t-1\cdot}^*, v_{t-2\cdot}^*, \ldots, v_{t-k+1\cdot}^*$ in common; thus, they cannot be independent, in fact they are not even uncorrelated! To examine the issues arising in the dynamic context we introduce the probability space, the family of nested sub $\sigma$-algebras defined in connection with Eq. (2.5), and we also stipulate that $\mathcal{A}_{-j} = (\emptyset, \Omega)$ for $j = 0, 1, \ldots, k$, i.e. we take initial conditions as given and nonstochastic.

Moreover, as we had done in the earlier discussion we convert the problem to one involving the scalar random variables

$$\zeta_T = \xi_T \lambda = \sum_{t=1}^{T} \zeta_{tT},$$

where $\lambda$ is an arbitrary conformable real vector. We note first that $\zeta_{tT}$ is $\mathcal{A}_{tT}$-measurable and that the stochastic sequence $\{(\zeta_{tT}, \mathcal{A}_{tT}) : t \le T\}$ is, for each $T \ge 1$, a martingale difference, owing to the fact that $E(\zeta_{tT}) = 0$. Moreover, it may be shown to obey a Lindeberg condition.

Thus, we have

Lemma 2. In the context of the dynamic GLSEM subject to assumptions (A.1) through (A.5) of Chapter 1, and the preceding discussion, the martingale difference $\{(\zeta_{tT}, \mathcal{A}_{tT}) : t \le T\}$ obeys a Lindeberg condition, i.e. if we put, for arbitrary integer $r$,

$$L_T = \sum_{t=1}^{T} \int_{|\zeta_{tT}| > 1/r} |\zeta_{tT}|^2 \, dP(\omega \mid \mathcal{A}_{t-1,T}), \quad \text{then} \quad \operatorname*{plim}_{T \to \infty} L_T = 0. \tag{2.41}$$

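Proof: We note that

$$|\zeta_{tT}|^2 \le \frac{1}{T} |\lambda|^2 \, \|(I \otimes x_{t\cdot}')\|^2 \, |u_{t\cdot}'|^2,$$

and we also define, for arbitrary integer $r$,

$$A_{tT1} = \left\{\omega : |\zeta_{tT}| > \frac{1}{r}\right\}, \qquad A_{tT2} = \left\{\omega : |u_{t\cdot}'| > \frac{T^{1/2}}{c_0 q_t}\right\},$$

where

$$c_0 = r m^{1/2} |\lambda|, \qquad q_t = (x_{t\cdot} x_{t\cdot}')^{1/2}.$$

Since $X'X/T$ converges, defining

$$\alpha_T^2 = \frac{\max_{t \le T} x_{t\cdot} x_{t\cdot}'}{T},$$

we note, by the results in the appendix to this chapter, that $\alpha_T \to 0$.

Moreover, putting

$$A_{tT3} = \left\{\omega : |u_{t\cdot}'| > \frac{1}{c_0 \alpha_T}\right\} = A_T,$$

we observe that $A_{tT1} \subset A_{tT2} \subset A_{tT3}$. Consequently, we may write the integral of the Lindeberg condition as

$$\int_{|\zeta_{tT}| > 1/r} |\zeta_{tT}|^2 \, dP(\omega \mid \mathcal{A}_{t-1,T}) \le \cdots \tag{2.42}$$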

In the preceding the first inequality is the result of the application of the triangle inequality, and of taking outside the integrand entities which are $\mathcal{A}_{t-1,T}$-measurable; the second and third inequalities simply follow because of set inclusions, i.e. $A_{tT1} \subset A_{tT2} \subset A_{tT3}$, and the fact that the vectors $u_{t\cdot}$, $t \ge 1$, are identically distributed. Thus, we may deduce from Eq. (2.42)

(2.43)

Since $\operatorname{tr} M_{xx} < \infty$, we conclude

(the last equality is valid because $(1/\alpha_T) \to \infty$) and the lemma is proved.

q.e.d.

We are now in a position to prove

Lemma 3. Under the conditions of Lemma 2,

Proof: From Proposition 21, Ch. 5 of Dhrymes (1989), we may show that $\zeta_T \overset{d}{\to} \zeta$ by showing that

$$\sum_{t=1}^{T} E(\zeta_{tT}^2 \mid \mathcal{A}_{t-1,T}) \overset{P}{\to} \sigma^2 \ge 0. \tag{2.44}$$

Since

it follows that

which demonstrates that for every conformable real vector, $\lambda$,

It follows, therefore, from Proposition 34, Ch. 4 of the reference above, that

q.e.d.
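The content of Lemmata 2 and 3 can be illustrated by simulation in the simplest dynamic case. The sketch below assumes a hypothetical scalar model $y_t = \lambda y_{t-1} + u_t$ with i.i.d. normal errors and a nonstochastic initial condition, and examines the normalized sum of the martingale differences $u_t y_{t-1}$. In this special case the sum of conditional variances, as in Eq. (2.44), converges to $\sigma^4/(1 - \lambda^2)$; the parameter values, sample size, number of replications and seed are arbitrary choices, and here $\lambda$ denotes the autoregressive coefficient rather than the arbitrary vector of the text.

```python
import numpy as np

rng = np.random.default_rng(0)
lam, sigma = 0.5, 1.0        # hypothetical AR(1) coefficient and error s.d.
T, reps = 400, 2000          # sample size and number of replications

u = rng.normal(0.0, sigma, size=(reps, T))
y_lag = np.zeros(reps)       # y_0 = 0: initial condition given and nonstochastic
zeta = np.zeros(reps)
for t in range(T):
    zeta += u[:, t] * y_lag          # martingale difference u_t * y_{t-1}
    y_lag = lam * y_lag + u[:, t]    # update to y_t
zeta /= np.sqrt(T)

# Limit of the sum of conditional variances in this special case.
limit_var = sigma**4 / (1.0 - lam**2)

print(zeta.mean(), zeta.var(), limit_var)
```

Across replications the simulated mean should be close to zero and the simulated variance close to the limiting value, even though the summands are dependent.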

We may therefore summarize our discussion in the important

Theorem 1. Consider the GLSEM of Chapter 1, subject to the assumptions (A.1) through (A.5). Whether the model is static or dynamic, all estimators considered thus far, i.e. the 2SLS, 3SLS, restricted 2SLS and restricted 3SLS estimators are, asymptotically, of the generic form

$$\sqrt{T}(\hat{\delta} - \delta) \sim A \xi_T',$$

where

and A is a fixed nonstochastic matrix specific to the particular estimator.

Proof: Lemmata 2 and 3.

Now that we have completed the technical task of establishing the limiting distribution of the 2SLS, 3SLS and all other estimators derived from them, we may summarize the properties of such estimators, beyond their consistency and asymptotic normality.

Theorem 2. Consider the GLSEM, as in Theorem 1. Then the following statements are true:

i. 3SLS are efficient relative to 2SLS estimators, unless

a. $\sigma_{ij} = 0$, for $i \ne j$, or

b. all equations of the system are just identified;

ii. restricted 3SLS estimators are efficient relative to unrestricted 3SLS estimators;

iii. restricted 2SLS estimators are not necessarily efficient relative to unrestricted 2SLS estimators.

Proof: Using the results of Eqs. (2.6), (2.7) and (2.8), putting $M_{xx} = RR'$ and $S^* = (I \otimes R')S$, etc., we can write the covariance matrices of the limiting distribution of the 2SLS and 3SLS estimators, respectively, as

$$C_2 = (S^{*\prime}S^*)^{-1} S^{*\prime}\Phi S^* (S^{*\prime}S^*)^{-1}, \qquad C_3 = (S^{*\prime}\Phi^{-1}S^*)^{-1}.$$

The efficiency of 3SLS relative to 2SLS is quite evident. It may be established by exactly the same argument one uses to establish the efficiency of the Aitken vis-a-vis the OLS estimator when, in the standard GLM, the covariance matrix is not scalar.

To establish part i.a we note that when $\sigma_{ij} = 0$, $i \ne j$, the $i$th diagonal block of $C_2$ is $\sigma_{ii}(S_i^{*\prime}S_i^*)^{-1}$, which is exactly the $i$th diagonal block of $C_3$.

As for part i.b we note that when all equations of the system are just identified, $S^*$ is a nonsingular matrix, so that we have

$$C_2 = S^{*-1}S^{*\prime -1}S^{*\prime}\Phi S^* S^{*-1}S^{*\prime -1} = S^{*-1}\Phi S^{*\prime -1} = C_3.$$
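Part i of the theorem lends itself to a direct numerical check. The sketch below builds a block diagonal $S^*$ from randomly generated blocks and takes $\Phi = \Sigma \otimes I$, which is an assumption here since the definition of $\Phi$ belongs to Eqs. (2.6) through (2.8) and is not repeated in this section; the dimensions, the random matrices and the helper functions are likewise illustrative. It verifies that $C_2 - C_3$ is positive semidefinite in general, and that the two matrices coincide in cases i.a and i.b.

```python
import numpy as np

rng = np.random.default_rng(1)
m, q = 3, 5                 # number of equations; rows of each block (hypothetical)
cols = [2, 3, 4]            # columns of S_1*, S_2*, S_3*, each at most q

def block_diag(blocks):
    """Assemble a block diagonal matrix from a list of 2-D arrays."""
    rows = sum(blk.shape[0] for blk in blocks)
    ncol = sum(blk.shape[1] for blk in blocks)
    out = np.zeros((rows, ncol))
    r = c = 0
    for blk in blocks:
        out[r:r + blk.shape[0], c:c + blk.shape[1]] = blk
        r += blk.shape[0]
        c += blk.shape[1]
    return out

def cov_2sls(S, Phi):
    """C_2 = (S'S)^{-1} S' Phi S (S'S)^{-1}."""
    SS_inv = np.linalg.inv(S.T @ S)
    return SS_inv @ S.T @ Phi @ S @ SS_inv

def cov_3sls(S, Phi):
    """C_3 = (S' Phi^{-1} S)^{-1}."""
    return np.linalg.inv(S.T @ np.linalg.inv(Phi) @ S)

S = block_diag([rng.normal(size=(q, c)) for c in cols])
L = rng.normal(size=(m, m))
Sigma = L @ L.T + m * np.eye(m)           # a positive definite Sigma
Phi = np.kron(Sigma, np.eye(q))           # assumption: Phi = Sigma (x) I

# General case: C_2 - C_3 is positive semidefinite.
gap = cov_2sls(S, Phi) - cov_3sls(S, Phi)
gap = (gap + gap.T) / 2.0                 # remove rounding asymmetry
print(np.linalg.eigvalsh(gap).min() >= -1e-10)

# Case i.a: sigma_ij = 0 for i != j.
Phi_diag = np.kron(np.diag(np.diag(Sigma)), np.eye(q))
print(np.allclose(cov_2sls(S, Phi_diag), cov_3sls(S, Phi_diag)))

# Case i.b: all equations just identified, so S* is square and nonsingular.
S_sq = block_diag([rng.normal(size=(q, q)) + 3.0 * np.eye(q) for _ in range(m)])
print(np.allclose(cov_2sls(S_sq, Phi), cov_3sls(S_sq, Phi)))
```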

To prove parts ii and iii we first begin by noting that Theorem 1 implies that restricted 2SLS and 3SLS estimators are asymptotically normally distributed with respective covariance matrices

$$C_{2R} = (I - (S^{*\prime}S^*)^{-1}H'P_2H)\,C_2\,(I - H'P_2H(S^{*\prime}S^*)^{-1}),$$

$$C_{3R} = (I - (S^{*\prime}\Phi^{-1}S^*)^{-1}H'P_3H)\,C_3\,(I - H'P_3H(S^{*\prime}\Phi^{-1}S^*)^{-1}),$$

where $H$ is the matrix of restrictions, which is of full row rank, say $r^*$, and

A simple computation shows that

which completes the proof of part ii.

As for part iii we have that

$$C_2 - C_{2R} = C_2 - [I - (S^{*\prime}S^*)^{-1}H'P_2H]\,C_2\,[I - H'P_2H(S^{*\prime}S^*)^{-1}].$$

In order to evaluate it, we note that $(S^{*\prime}S^*)^{-1}H'P_2H$ is a nonsymmetric matrix of dimension equal to the number of parameters to be estimated, viz. $k = \sum_{i=1}^{m}(G_i + m_i)$. As for its rank, we note that the nonzero roots of

are exactly those of

which, evidently, consist of $r^*$ unities, where $r^*$ is the rank of $H$. Let $E$ be the matrix of characteristic vectors, which is assumed to be nonsingular, and note that we have the representation,

It follows, therefore, that

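$$E^{-1}(C_2 - C_{2R})E^{\prime -1} = E^{-1}C_2E^{\prime -1} - \begin{bmatrix} I_{k-r^*} & 0 \\ 0 & 0 \end{bmatrix} E^{-1}C_2E^{\prime -1} \begin{bmatrix} I_{k-r^*} & 0 \\ 0 & 0 \end{bmatrix}.$$

Partitioning,

$$E^{-1}C_2E^{\prime -1} = \begin{bmatrix} C^*_{2(11)} & C^*_{2(12)} \\ C^*_{2(21)} & C^*_{2(22)} \end{bmatrix},$$

conformably with that of the matrix of characteristic roots, we can rewrite the difference as

It is evident that unless $C^*_{2(12)} = 0$, the difference of the two covariance matrices above is indefinite.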

q.e.d.

The result in part iii of Theorem 2 confirms the observation made at the time we considered restricted 2SLS, viz. that having restrictions across equations destroys the fundamental character of 2SLS as a single equation procedure; as we see now it gains us nothing, in the sense that we cannot prove that the restricted estimator is efficient relative to the unrestricted 2SLS estimator. Thus, its usefulness is questionable. The reader is invited to ask (and answer) the question: what about the case where restrictions do not apply across equations?
