Corollary 1. In the context of Proposition 3, suppose that
6.5 Causality and Related Issues
6.5.1 Introduction
When we studied the GLSEM we imposed prior restrictions that served to identify the parameters of the model, and we classified variables as jointly dependent, lagged dependent, and exogenous. We also observed that in order that parameters be estimated consistently the model needed to contain variables that, in varying degrees of stringency, were unrelated to the structural error process. In the previous chapters we also introduced various tests that addressed the validity of the prior restrictions, but we never questioned the fact that some of the variables contained in the GLSEM were in some sense unrelated to the error process. This section deals with such topics.
A veritable literature has developed around the issues of "causality", "strict exogeneity", "weak exogeneity" and "predeterminateness". It is quite outside the objectives of this volume to provide a thorough analysis of the issues involved; we shall only touch the highlights, and the reader interested in greater detail may consult the excellent review paper by Geweke (1984).
6.5.2 Basic Concepts
In the standard (benchmark) model,
$$y_{t\cdot} B^* = x_{t\cdot} C + u_{t\cdot}, \quad t = 1, 2, \ldots, T,$$
the errors are i.i.d. and thus, it is simple to conclude that
even when the system is dynamic, provided we have stability. Consequently, it was natural in the early development of econometrics to define the class of predetermined variables, say $x_{s\cdot}$, to consist of the lagged dependent and exogenous variables. Such variables have the property that they are at least uncorrelated with and, under the i.i.d. assumption, independent of the error terms $u_{t\cdot}$, for $s < t$. As we pointed out in Chapter 1, however, this is not a satisfactory classification scheme if the specification of the error process is allowed wider scope. In particular, the class of "lagged endogenous" and "exogenous variables" would be of little interest if we changed the probabilistic specification of the error process to, say, strict or even weak stationarity! We are thus led to
Definition 1. Let $z_{t\cdot} = (y_{t\cdot}, x_{t\cdot})$ be the set of variables contained in the GLSEM (or GNLSEM with additive errors); an element therein, say $z_{tj}$, is said to be predetermined at time $t$ with respect to the $i$th equation, if the sequence $\{z_{sj} : s < t\}$ is independent of the structural error $u_{ti}$; it is said to be predetermined at time $t$ with respect to the system, if the sequence $\{z_{sj} : s < t\}$ is independent of the structural error vector $u_{t\cdot}$. A variable is said to be predetermined with respect to the $i$th equation, or the system, if it is predetermined at time $t$, for all $t$, with respect to the $i$th equation, or the system, respectively.
Remark 14. Notice that, by Definition 1, both the class of lagged dependent and exogenous variables, as well as the variables $y_{tq}$, $q < i$, are predetermined at time $t$, for all $t$, with respect to the $i$th equation, if the GLSEM is simply recursive.
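The point made before Definition 1, that lagged dependent variables lose their "predetermined" character once the errors are no longer i.i.d., can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the scalar model $y_t = a y_{t-1} + u_t$, the AR(1) error $u_t = \rho u_{t-1} + e_t$, the parameter values and the function name `lag_error_correlation` are all hypothetical and do not come from the text.

```python
import numpy as np

def lag_error_correlation(rho, a=0.5, T=100_000, seed=0):
    """Sample correlation between y_{t-1} and u_t in the (assumed) model
    y_t = a*y_{t-1} + u_t, with u_t = rho*u_{t-1} + e_t and e_t i.i.d. N(0, 1)."""
    rng = np.random.default_rng(seed)
    e = rng.standard_normal(T)
    u = np.zeros(T)
    y = np.zeros(T)
    for t in range(1, T):
        u[t] = rho * u[t - 1] + e[t]   # rho = 0 gives i.i.d. errors
        y[t] = a * y[t - 1] + u[t]     # stable since |a| < 1
    # pair y_{t-1} with u_t
    return float(np.corrcoef(y[:-1], u[1:])[0, 1])

corr_iid = lag_error_correlation(rho=0.0)  # i.i.d. errors
corr_ar1 = lag_error_correlation(rho=0.7)  # serially correlated errors
print(f"i.i.d. errors: {corr_iid:.3f}")
print(f"AR(1) errors:  {corr_ar1:.3f}")
```

With i.i.d. errors the sample correlation should be close to zero, while with serially correlated errors it should be clearly positive; in the latter case the lagged dependent variable is no longer unrelated to the current structural error.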
Definition 2. A variable $z_{tj}$ is said to be strictly exogenous relative to the system above, if and only if the sequences
$$\{z_{tj} : t = 0, \pm 1, \pm 2, \ldots\} \quad \text{and} \quad \{u_{t\cdot}' : t = 0, \pm 1, \pm 2, \ldots\}$$
are mutually independent, in the sense that their joint distribution is the product of their respective marginal distributions.
Remark 15. In the first four chapters of this volume, we usually wrote $x_{t\cdot} = (y_{t-1\cdot}, y_{t-2\cdot}, \ldots, y_{t-k\cdot}, p_{t\cdot})$, and we termed the vector $p_{t\cdot}$ the vector of exogenous variables. In the terminology of this section this means that the elements of $p_{t\cdot}$ are strictly exogenous relative to the system above.
Another interesting concept, introduced into the literature by Engle, Hendry and Richard (1983), is that of "weak exogeneity". Unlike the two other concepts above, which are completely determined by the model and the nature of the variables' probabilistic properties, weak exogeneity depends, in addition, on the investigator's objectives or "loss function". Specifically, we have
Definition 3. In the context of the GLSEM (or the GNLSEM with additive errors), a set of variables, say $z_{t\cdot}^{(r)}$, which is a subvector of $z_{t\cdot}$, $t = 1, 2, \ldots, T$, is said to be weakly exogenous if and only if there is a one to one reparameterization, say $\phi = H(\theta)$, $\theta$ being the parameter set of the model, such that $\phi = (\phi_1', \phi_2')'$, and the likelihood function can be decomposed as
$$L(Z; \phi) = L_1(Z^{(*)} \mid Z^{(r)}; \phi_1)\, L_2(Z^{(r)}; \phi_2),$$
where $Z^{(r)} = (z_{t\cdot}^{(r)})$, $Z^{(*)}$ is the complement of $Z^{(r)}$ in $Z = (z_{t\cdot})$, $\phi_1$ is the parameter set of interest to the investigator, and $\phi_2$ is a set of nuisance parameters.
Remark 16. In slightly more familiar terms, the reader may think of $Z^{(*)}$ as $Y$ and of $Z^{(r)}$ as $X$. The decomposition then would read
$$L(Y, X; \phi) = L_1(Y \mid X; \phi_1)\, L_2(X; \phi_2).$$
Of course, there is nothing remarkable about writing the joint distribution of the jointly dependent ($Y$) and "predetermined" ($X$) variables as the product of the conditional distribution of the former, given the latter, and the marginal distribution of the latter. The real restriction here is that the parameters of interest appear only in $L_1$, while the parameters in $L_2$ are of no direct interest.
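As a hedged illustration of the restriction described in Remark 16, consider a scalar example. The normal setup and the symbols $\beta$, $\sigma^2$ and $\omega^2$ are assumptions introduced here for illustration only; they do not appear in the text. Suppose that, conditionally on $x_t$, $y_t \sim N(\beta x_t, \sigma^2)$, that marginally $x_t \sim N(0, \omega^2)$, and that observations are independent over $t$. The joint likelihood then factors as

$$
\underbrace{\prod_{t=1}^{T} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{ -\frac{(y_t - \beta x_t)^2}{2\sigma^2} \right\}}_{L_1(Y \mid X;\, \phi_1)}
\;\times\;
\underbrace{\prod_{t=1}^{T} \frac{1}{\sqrt{2\pi\omega^2}} \exp\left\{ -\frac{x_t^2}{2\omega^2} \right\}}_{L_2(X;\, \phi_2)},
$$

with $\phi_1 = (\beta, \sigma^2)'$ and $\phi_2 = \omega^2$. If the investigator cares only about $\beta$ and $\sigma^2$, and if no restriction links $\phi_1$ to $\phi_2$ (an additional condition assumed in this sketch), then $x_t$ plays the role of a weakly exogenous variable and nothing is lost by basing inference on $L_1$ alone. Were $\beta$ to appear in the marginal distribution of $x_t$ as well, the factorization would still hold as an identity, but the parameters of interest would no longer be confined to $L_1$.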
There are two aspects to the underlying issues. The first is: what are the (abstract) logical requirements for defining entities which can serve as "instruments" for the consistent estimation of parameters of the model? The second, and equally important, aspect is: even if we settle on the appropriate set of properties to be possessed by instruments, how can we be assured that our assertion of such properties with respect to certain variables is valid? It is at this stage that the notion of "causality" becomes relevant.
The term "causality" is grossly inappropriate for what it seeks to describe in econometrics. As usual, the origins of the term lie in other disciplines, in which perhaps it is more suited to the circumstances it seeks to describe.34
The basic notion is this: if in some sense $x$ "causes" $y$ then we should be better able to "explain" $y$ with $x$ rather than without it. A slightly more formal definition, introduced into the econometrics literature by Granger (1963), (1968), is given below.
Definition 4. Consider the set of variables $z_{t\cdot} = (y_{t\cdot}, x_{t\cdot}, q_{t\cdot})$ and adopt the notation, say $Y_{t-1} = \{y_{t-s\cdot} : s \geq 1\}$, etc. Suppose we consider the class of best linear one step ahead predictors of the variables in $y$ and/or the variables in $x$, and suppose further that the only "relevant" information for that purpose is contained in $Z_{t-1} = (Y_{t-1}, X_{t-1}, Q_{t-1})$. Let $P^{(Z)}$, $P^{(Y,Q)}$, $P^{(X,Q)}$ be projection operators into the space spanned by $Z_{t-1}$, $(Y_{t-1}, Q_{t-1})$ and $(X_{t-1}, Q_{t-1})$, respectively. Then, $x$ is said to cause $y$ if and only if
If the condition above does not hold then we say that $x$ does not "cause" $y$. The definition is perfectly symmetrical, i.e. the meaning of $y$ "causes" $x$ is obtained by simply interchanging the roles of $x$ and $y$ above.
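The basic notion, that $x$ "causes" $y$ when $y$ is better predicted with the past of $x$ than without it, can be sketched numerically. The example below rests on several assumptions of its own: the simulated scalar series, their coefficients, the use of a single lag, the omission of the auxiliary variables $q$, and the use of in-sample mean squared error as a stand-in for prediction quality are all illustrative choices, and the helper `one_step_mse` is hypothetical. It is not a formal test, nor a reproduction of the exact condition in Definition 4.

```python
import numpy as np

def one_step_mse(target, predictors):
    """In-sample MSE of the least-squares linear projection of `target`
    on a constant and the given list of lagged predictor series."""
    X = np.column_stack([np.ones(len(target))] + predictors)
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ coef
    return float(np.mean(resid ** 2))

# Assumed data-generating process: lagged x enters y, lagged y does not enter x.
rng = np.random.default_rng(0)
T = 5000
x = np.zeros(T)
y = np.zeros(T)
ex = rng.standard_normal(T)
ey = rng.standard_normal(T)
for t in range(1, T):
    x[t] = 0.5 * x[t - 1] + ex[t]
    y[t] = 0.3 * y[t - 1] + 0.8 * x[t - 1] + ey[t]

# Predict y_t from (y_{t-1}, x_{t-1}) versus y_{t-1} alone.
mse_y_full = one_step_mse(y[1:], [y[:-1], x[:-1]])
mse_y_restricted = one_step_mse(y[1:], [y[:-1]])

# Interchange the roles of x and y.
mse_x_full = one_step_mse(x[1:], [x[:-1], y[:-1]])
mse_x_restricted = one_step_mse(x[1:], [x[:-1]])

print("y: with lagged x", mse_y_full, "without", mse_y_restricted)
print("x: with lagged y", mse_x_full, "without", mse_x_restricted)
```

In this design, dropping lagged $x$ should visibly worsen the prediction of $y$, whereas dropping lagged $y$ should leave the prediction of $x$ essentially unchanged, which matches the asymmetric sense in which $x$ "causes" $y$ but not conversely.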
Remark 17. The connection between this concept and estimation in the context of the GLSEM, or the GNLSEM with additive errors, is this: if $y_{t-s\cdot}$, for some $s$, "causes" $x_{t\cdot}$, this would most certainly mean that $x_{t\cdot}$ cannot be an exogenous variable, since this finding would violate the condition that the joint distribution of the exogenous variables be independent of the (joint distribution of the) error process, $u_{t\cdot}$. If, on the other hand, $y$ does not cause $x$, we have no evidence that $x_{t\cdot}$ is cor-
34 The term "causal", meaning that something can be represented in terms of something else either wholly or in part, occurs in the literature of time series. For example, Brockwell and Davis (1987), p. 83, second edition (1992), define an ARMA($m$, $n$) process (i.e., an autoregressive moving average process), say $x_{t\cdot}$, to be "causal", or a "causal function" of $\{\epsilon_{t\cdot}\}$, if
$$\sum_{j=0}^{m} A_j x_{t-j\cdot} = \sum_{i=0}^{n} B_i \epsilon_{t-i\cdot},$$