Corollary 1. In the context of Theorem 4, the system as a whole is identified by the exclusion and normalization conditions $L^{*\prime}\,\mathrm{vec}(A^*) = \mathrm{vec}(H)$,
4.2 The Single Equation LIML Estimator
Here we examine in considerable detail the nature of the LIML estimator when the subsystem of interest consists of a single equation, say the first.
In terms of the (standard) notation of the earlier discussion, we are dealing with
$$
m^* = 1, \qquad B^*_I = \begin{bmatrix} \beta^0_{\cdot 1} \\ 0 \end{bmatrix}, \qquad C_I = \begin{bmatrix} \gamma_{\cdot 1} \\ 0 \end{bmatrix}, \qquad \Sigma_{11} = \sigma_{11}. \tag{4.41}
$$
If we attempt to maximize Eq. (4.40) without imposing the standard normalization, we shall not find it possible to estimate, by LIML techniques, all the elements of $\beta^0_{\cdot 1}$. In the end, there will remain one degree of arbitrariness; this may be removed by imposing a normalization convention on $\beta^0_{\cdot 1}$. In 2SLS, we did so, ab initio, by the condition $b_{11} = 1$. In LIML estimation, we are more flexible. Instead of elaborating at this point, we shall wait until the matter arises naturally in the discussion.
Let us return to the development of our argument, and partition $M$ conformably with $\alpha^0_{\cdot 1}$, as shown in Eq. (4.41). Thus,
(4.42)
From the discussion above, note that $M_{11}$ is $(m_1+1)\times(m_1+1)$. Furthermore, partition $W$ conformably with $(\beta^{0\prime}_{\cdot 1},\, 0)'$, to obtain
(4.43) In this notation, we have
(4.44) Thus, the concentrated LF, in the special case of the first structural equation, may be written as
(4.45)
where
$$
C_0 = -\frac{mT}{2}\left[\ln(2\pi)+1\right] + \frac{T}{2}\left(1-\ln|W|\right). \tag{4.46}
$$
This is now to be maximized with respect to $\beta^0_{\cdot 1}$, $\gamma_{\cdot 1}$, and $\sigma_{11}$. We follow a stepwise maximization procedure. Maximizing first with respect to $\gamma_{\cdot 1}$, and writing the result in column form, we obtain
(4.47) which implies
$$
\hat{\gamma}_{\cdot 1} = M_{33}^{-1} M_{31}\,\beta^0_{\cdot 1}. \tag{4.48}
$$
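As a numerical illustration of Eq. (4.48), the step can be checked with simulated data; everything below (dimensions, variable names, the simulated matrices) is hypothetical and not taken from the text:

```python
import numpy as np

# Hypothetical numerical check of Eq. (4.48): gamma_hat = M33^{-1} M31 beta0,
# where M33 is the second moment matrix of the included predetermined
# variables X1, M31 the cross moment of X1 with the included jointly
# dependent variables Y0, and beta0 a trial coefficient vector.
rng = np.random.default_rng(0)
T, m1, G1 = 200, 2, 3                    # assumed sample size and dimensions
Y0 = rng.standard_normal((T, m1 + 1))    # stand-in for the included jointly dependent variables
X1 = rng.standard_normal((T, G1))        # stand-in for the included predetermined variables

M33 = X1.T @ X1 / T
M31 = X1.T @ Y0 / T
beta0 = rng.standard_normal(m1 + 1)

# Eq. (4.48): solve M33 gamma = M31 beta0 rather than forming the inverse.
gamma_hat = np.linalg.solve(M33, M31 @ beta0)

# Equivalently, gamma_hat is the OLS coefficient from regressing Y0 beta0 on X1.
gamma_ols, *_ = np.linalg.lstsq(X1, Y0 @ beta0, rcond=None)
assert np.allclose(gamma_hat, gamma_ols)
```

The equivalence with the OLS regression of $Y^0\beta^0_{\cdot 1}$ on $X_1$ is simply the normal-equations form of the first-order condition in $\gamma_{\cdot 1}$.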
It is apparent that, for given $\beta^0_{\cdot 1}$ and $\sigma_{11}$, $\hat{\gamma}_{\cdot 1}$ of Eq. (4.48) globally maximizes Eq. (4.45). Inserting Eq. (4.48) into Eq. (4.45) and maximizing with respect to $\sigma_{11}$, we find
(4.49)
which implies
$$
\hat{\sigma}_{11} = \beta^{0\prime}_{\cdot 1}\left(M_{11} - M_{13}M_{33}^{-1}M_{31}\right)\beta^0_{\cdot 1}. \tag{4.50}
$$
Since, given $\beta^0_{\cdot 1}$, the (concentrated) LF is concave, the estimator in Eq. (4.50) corresponds to a global maximum. Now inserting Eqs. (4.50) and (4.48) into Eq. (4.45), we have
$$
L(\beta^0_{\cdot 1}) = C_0 - \frac{T}{2} - \frac{T}{2}\ln\left(\frac{\beta^{0\prime}_{\cdot 1} W^*_{11}\,\beta^0_{\cdot 1}}{\beta^{0\prime}_{\cdot 1} W_{11}\,\beta^0_{\cdot 1}}\right), \tag{4.51}
$$
where
$$
W^*_{11} = M_{11} - M_{13}M_{33}^{-1}M_{31}. \tag{4.52}
$$
To maximize Eq. (4.51), it is simpler to use an alternative to the straightforward differentiation process. We first note that maximizing Eq. (4.51) is equivalent to minimizing
$$
h(\beta^0_{\cdot 1}) = \frac{\beta^{0\prime}_{\cdot 1} W^*_{11}\,\beta^0_{\cdot 1}}{\beta^{0\prime}_{\cdot 1} W_{11}\,\beta^0_{\cdot 1}}. \tag{4.53}
$$
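The objects $W_{11}$, $W^*_{11}$, and the ratio $h$ of Eq. (4.53) can be computed directly from data. The following sketch uses simulated data with hypothetical dimensions; $W_{11}$ and $W^*_{11}$ are formed as residual second-moment matrices from regressions on all predetermined variables $X$ and on the included subset $X_1$, respectively:

```python
import numpy as np

# A minimal sketch of Eqs. (4.51)-(4.53) on simulated (hypothetical) data.
rng = np.random.default_rng(1)
T, m1 = 300, 2
X = rng.standard_normal((T, 5))    # all predetermined variables
X1 = X[:, :3]                      # those included in the first equation
Y0 = X @ rng.standard_normal((5, m1 + 1)) + rng.standard_normal((T, m1 + 1))

def residual_moment(Y, Z):
    """Second moment matrix of residuals from regressing Y on Z."""
    coef, *_ = np.linalg.lstsq(Z, Y, rcond=None)
    resid = Y - Z @ coef
    return resid.T @ resid

W11 = residual_moment(Y0, X)         # regression on all of X
W11_star = residual_moment(Y0, X1)   # regression on X1 only

def h(beta):
    """The variance ratio of Eq. (4.53), to be minimized over beta."""
    return (beta @ W11_star @ beta) / (beta @ W11 @ beta)

# Adding regressors can only reduce residual sums of squares, so
# W11_star - W11 is positive semidefinite and h(beta) >= 1 for every beta.
beta = rng.standard_normal(m1 + 1)
assert h(beta) >= 1.0
```

The inequality $h \ge 1$ reflects that the restricted regression (on $X_1$ alone) cannot fit better than the unrestricted one (on all of $X$).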
Remark 1. This is a convenient time to examine how normalization requirements intrude in LIML, and how they are handled. In Eq. (4.51) we have the concentrated LF, which is now to be maximized in order to obtain the estimator for $\beta^0_{\cdot 1}$, say $\hat{\beta}^0_{\cdot 1}$; inserting the latter in Eqs. (4.48) and (4.50) yields the estimators for $\gamma_{\cdot 1}$ and $\sigma_{11}$, respectively, and thus completes the LIML estimation process. The difficulty with Eq. (4.51) (or (4.53), for that matter) is that the concentrated LF, as exhibited therein, is homogeneous of degree zero in the unknown parameter. Thus, the maximizing
value of $\beta^0_{\cdot 1}$, if it exists, will not be unique. Indeed, if $\hat{\beta}^0_{\cdot 1}$ is such a value, $c\hat{\beta}^0_{\cdot 1}$, where $c$ is any nonzero scalar, will be a maximizing value as well. We may eliminate this indeterminacy by imposing a normalization requirement.
Thus, LIML does not escape the need to impose a normalizing convention, but it affords us greater flexibility in dealing with normalizations. While in 2SLS and 3SLS we were required ab initio to set, in the first equation, the first element of $\beta^0_{\cdot 1}$ equal to unity, in the present context this type of normalization is not an integral part of the estimation procedure; in fact, any other type of normalization will do as well. We may, for example, require that $\beta^{0\prime}_{\cdot 1} W_{11}\,\beta^0_{\cdot 1} = 1$. We shall not pursue such issues at this stage, preferring to continue our argument to its conclusion, without first imposing a normalization convention.
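The homogeneity of degree zero noted in the Remark is easy to verify numerically. In the sketch below, $W_{11}$ and $W^*_{11}$ are arbitrary positive definite stand-ins, not quantities computed from the text:

```python
import numpy as np

# Numerical check that the ratio h of Eq. (4.53) satisfies h(c*beta) = h(beta)
# for any scalar c != 0, so a maximizer is determined only up to scale.
rng = np.random.default_rng(2)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))
W11 = A @ A.T + 3 * np.eye(3)     # arbitrary positive definite stand-in
W11_star = W11 + B @ B.T          # ensures W11_star - W11 is p.s.d.

def h(beta):
    return (beta @ W11_star @ beta) / (beta @ W11 @ beta)

beta = rng.standard_normal(3)
for c in (2.0, -0.5, 10.0):
    assert np.isclose(h(c * beta), h(beta))
```

This is exactly why a normalization (e.g. $b_{11}=1$, or $\beta^{0\prime}_{\cdot 1}W_{11}\beta^0_{\cdot 1}=1$) must be appended at the end of the procedure.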
This is also a convenient time for showing that the stepwise maximization procedure, employed above, is fully equivalent to a "simultaneous" maximization approach. Thus, simultaneously maximizing, we obtain, writing the results in column form,
$$
\frac{\partial L}{\partial \gamma_{\cdot 1}} = 0, \qquad
\frac{\partial L}{\partial \sigma_{11}} = 0, \qquad
\frac{\partial L}{\partial \beta^0_{\cdot 1}} = 0. \tag{4.54}
$$
From the first and second equations, we easily obtain Eqs. (4.48) and (4.50).
Substituting these values in the third equation above yields
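One way to write the resulting condition explicitly (a sketch consistent with Eqs. (4.48), (4.50), and (4.53), with symbols as defined there) is:

```latex
% First-order condition after substitution; a reconstruction consistent
% with the ratio h of Eq. (4.53), not a verbatim display.
\frac{T}{\hat{\sigma}_{11}}
\left[\, W^{*}_{11}
  \;-\; \frac{\beta^{0\prime}_{\cdot 1}\, W^{*}_{11}\, \beta^{0}_{\cdot 1}}
             {\beta^{0\prime}_{\cdot 1}\, W_{11}\, \beta^{0}_{\cdot 1}}\; W_{11}
\right]\beta^{0}_{\cdot 1} \;=\; 0 .
```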
Since $\sigma_{11} > 0$, the preceding implies that the LIML estimator of $\beta^0_{\cdot 1}$ is simply one of the characteristic vectors of $W^*_{11}$ in the metric of $W_{11}$.$^4$ As we shall see momentarily, this is exactly the conclusion we should arrive at when we proceed to minimize the expression in Eq. (4.53), which will complete the demonstration that the stepwise and simultaneous maximization of the LF are equivalent procedures.
Now, let us proceed with the formal aspects of minimizing Eq. (4.53).
Since $W_{11}$ and $W^*_{11}$ $^5$ are both positive definite matrices, we can simultaneously decompose them by
$$
W_{11} = P'P, \qquad W^*_{11} = P'\Lambda P, \tag{4.55}
$$

$^4$For a definition of this term, see Dhrymes (1984), p. 73.

$^5$Note that $W_{11}$ is the second moment matrix of residuals from the regression of
where $P$ is a nonsingular matrix, and $\Lambda$ is the (diagonal) matrix of the characteristic roots of $W^*_{11}$ in the metric of $W_{11}$, i.e., its diagonal elements are the roots of
$$
|\lambda W_{11} - W^*_{11}| = 0. \tag{4.56}
$$
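The simultaneous decomposition of Eq. (4.55) can be carried out numerically via a Cholesky factor of $W_{11}$. In the sketch below, $W_{11}$ and $W^*_{11}$ are arbitrary positive definite stand-ins; the roots found are exactly those of Eq. (4.56):

```python
import numpy as np

# Sketch of Eq. (4.55): with C the Cholesky factor of W11 (W11 = C C') and
# (lam, Q) the eigensystem of inv(C) W11_star inv(C)', the matrix P = Q'C'
# satisfies W11 = P'P and W11_star = P' diag(lam) P.
rng = np.random.default_rng(3)
k = 3
A = rng.standard_normal((k, k))
B = rng.standard_normal((k, k))
W11 = A @ A.T + np.eye(k)          # arbitrary positive definite stand-in
W11_star = W11 + B @ B.T           # keeps W11_star - W11 p.s.d.

C = np.linalg.cholesky(W11)                # W11 = C C'
Cinv = np.linalg.inv(C)
S = Cinv @ W11_star @ Cinv.T               # symmetric positive definite
lam, Q = np.linalg.eigh(S)                 # S = Q diag(lam) Q'
P = Q.T @ C.T

assert np.allclose(P.T @ P, W11)
assert np.allclose(P.T @ np.diag(lam) @ P, W11_star)

# The lam_i are the roots of |lambda W11 - W11_star| = 0 (Eq. (4.56));
# each pair satisfies W11_star v = lambda W11 v, with v = inv(C)' q.
i = np.argmin(lam)
v = Cinv.T @ Q[:, i]                       # vector for the smallest root
assert np.allclose(W11_star @ v, lam[i] * W11 @ v)
```

The smallest root and its vector are precisely the objects that will deliver the LIML estimator below.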
Thus, because of Eq. (4.55), the ratio in Eq. (4.53) may be written as
$$
h(\beta^0_{\cdot 1}) = \frac{\zeta'\Lambda\zeta}{\zeta'\zeta}, \tag{4.57}
$$
where
$$
\zeta = P\beta^0_{\cdot 1}. \tag{4.58}
$$
Therefore,
$$
h(\beta^0_{\cdot 1}) = \sum_{i=1}^{m_1+1} \lambda_i\,\frac{\zeta_i^2}{\sum_{j=1}^{m_1+1}\zeta_j^2}. \tag{4.59}
$$
The coefficients of the $\lambda_i$ are positive, and $\lambda_i \ge 0$; thus, we conclude
$$
\min_i \lambda_i \;\le\; h(\beta^0_{\cdot 1}) \;\le\; \max_i \lambda_i. \tag{4.60}
$$
Since we wish to find the global minimum of the expression in Eq. (4.53), we must choose the estimator of $\beta^0_{\cdot 1}$, say $\hat{\beta}^0_{\cdot 1}$, by the condition
$$
h(\hat{\beta}^0_{\cdot 1}) = \min_i \lambda_i, \tag{4.61}
$$
which means that $\hat{\beta}^0_{\cdot 1}$ is chosen to be the characteristic vector corresponding to the smallest characteristic root of $W^*_{11}$ in the metric of $W_{11}$. To emphasize this aspect, note that such a characteristic root and vector obey
$$
W^*_{11}\,\hat{\beta}^0_{\cdot 1} = \hat{\lambda}\, W_{11}\,\hat{\beta}^0_{\cdot 1}. \tag{4.62}
$$
Premultiplying by $\hat{\beta}^{0\prime}_{\cdot 1}$, we obtain
$$
\hat{\beta}^{0\prime}_{\cdot 1} W^*_{11}\,\hat{\beta}^0_{\cdot 1} = \hat{\lambda}\,\hat{\beta}^{0\prime}_{\cdot 1} W_{11}\,\hat{\beta}^0_{\cdot 1}, \tag{4.63}
$$
$Y^0_t$ on $X$, and $W^*_{11}$ is the second moment matrix of residuals from the regression of $Y^0_t$ on $X_1$. We stress that $X$ is the matrix of observations on all the predetermined variables of the entire system, while $X_1$ is the matrix of observations on the predetermined variables actually appearing in, or more precisely not known to be absent from, the first structural equation.
which, indeed, attains the maximum maximorum of the LF. Thus, in contrast to the FIML procedure we followed in Chapter 3, which only guarantees a local maximum, in the LIML procedure we find the ML estimators by locating the global maximum of the LF.
Substituting in Eqs. (4.48) and (4.50), we obtain the LIML estimators of $\gamma_{\cdot 1}$ and $\sigma_{11}$ as well. Consequently, the LIML procedure with respect to the first structural equation is complete, except for the normalization condition. Since characteristic vectors are unique only up to scalar multiplication, we may, at this stage, remove this last ambiguity by imposing any normalization scheme we choose. In this fashion, LIML is shown to be more flexible about normalization conventions than is 2SLS.
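The complete single-equation procedure can be collected into one routine. The sketch below is illustrative: function and variable names are hypothetical, the moment matrices are scaled by $T$, and the 2SLS-style normalization $b_{11}=1$ is imposed at the end, following the steps of Eqs. (4.48), (4.50), and (4.62):

```python
import numpy as np

def liml_first_equation(y1, Y1, X1, X):
    """Single-equation LIML sketch: y1 is the T-vector of the normalized
    dependent variable, Y1 the other included jointly dependent variables,
    X1 the included predetermined variables, X all predetermined variables.
    Returns (beta0, gamma, sigma11, lam_min), normalizing beta0[0] = 1."""
    Y0 = np.column_stack([y1, Y1])

    def residual_moment(Y, Z):
        coef, *_ = np.linalg.lstsq(Z, Y, rcond=None)
        R = Y - Z @ coef
        return R.T @ R

    W11 = residual_moment(Y0, X)         # residuals from regression on all of X
    W11_star = residual_moment(Y0, X1)   # residuals from regression on X1 only

    # Smallest characteristic root/vector of W11_star in the metric of W11
    # (Eq. (4.62)), via a Cholesky factor of W11.
    C = np.linalg.cholesky(W11)
    Cinv = np.linalg.inv(C)
    lam, Q = np.linalg.eigh(Cinv @ W11_star @ Cinv.T)
    beta0 = Cinv.T @ Q[:, np.argmin(lam)]
    beta0 = beta0 / beta0[0]             # normalization: coefficient of y1 is unity

    # Eq. (4.48): gamma = M33^{-1} M31 beta0.
    gamma = np.linalg.solve(X1.T @ X1, X1.T @ Y0 @ beta0)
    # Eq. (4.50), with moment matrices scaled by T.
    T = Y0.shape[0]
    sigma11 = beta0 @ W11_star @ beta0 / T
    return beta0, gamma, sigma11, lam.min()

# Hypothetical simulated system for illustration (true values are assumptions).
rng = np.random.default_rng(4)
T = 500
X = rng.standard_normal((T, 4))
X1 = X[:, :2]
u = rng.standard_normal((T, 2)) @ np.array([[1.0, 0.4], [0.4, 1.0]])
y2 = X @ np.array([0.5, -1.0, 2.0, 1.0]) + u[:, 1]     # reduced form for y2
y1 = 0.8 * y2 + X1 @ np.array([1.0, -0.5]) + u[:, 0]   # first structural equation
beta0, gamma, sigma11, lam_min = liml_first_equation(y1, y2[:, None], X1, X)
```

With the sign convention $Y^0\beta^0_{\cdot 1} = X_1\gamma_{\cdot 1} + u_{\cdot 1}$, the recovered `beta0[1]` should be near $-0.8$ and `gamma` near $(1.0, -0.5)$ in this simulated design.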
We have therefore proved the important
Theorem 1. Consider the GLSEM as in Theorem 10 of Ch. 3. The concentrated likelihood function, for LIML purposes, relative to the first $m^*$
structural equations is given by
(4.64)
where $c^* = -(mT/2)\left[\ln(2\pi)+1\right] + (T/2)m^* - (T/2)\ln|W|$, $A^*_I$ consists of the first $m^*$ columns of $A^*$, $B^*_I$ consists of the first $m^*$ columns of $B^*$, and $\Sigma_{11}$ is a submatrix of $\Sigma$ consisting of the latter's first $m^*$ rows and columns. Moreover, the LIML estimator of the parameters of the first structural equation is given by
(4.65)
where $\hat{\beta}^0_{\cdot 1}$ is the characteristic vector corresponding to the smallest characteristic root of $|\lambda W_{11} - W^*_{11}| = 0$. Such estimators are uniquely determined once a normalization condition is imposed on $\beta^0_{\cdot 1}$.