CHAPTER 22
The linear regression model IV — departures from the sampling model assumption
One of the most crucial assumptions underlying the linear regression model is the sampling model assumption that y = (y₁, y₂, …, y_T)′ constitutes an independent sample sequentially drawn from D(y_t/X_t; θ), t = 1, 2, …, T, respectively. This assumption enables us to define the likelihood function to be

L(θ; y) = c(y) ∏_{t=1}^{T} D(y_t/X_t; θ).   (22.1)
Intuitively, this assumption amounts to postulating that the ordering of the observations in y_t and X_t plays no role in the statistical analysis of the model. That is, in the case where the data on y_t and X_t are punched observation by observation, a reshuffling of the cards will change none of the results in Chapter 19. This is a very restrictive assumption for most economic time-series data, where some temporal dependence between successive values seems apparent. As argued in Chapter 17, for most economic time series the non-random sampling model seems more appropriate.
In Section 22.1 we consider the implications of a non-independent sample for the statistical results derived in Chapter 19. It is argued that these implications depend crucially on how non-independence is modelled, and two alternative modelling strategies are discussed. These strategies give rise to two alternative approaches, the respecification and autocorrelation approaches, to misspecification testing and to ways of tackling the dependence in the sample. In the context of the autocorrelation approach the dependence is interpreted as due to error temporal correlation. On the other hand, in the context of the respecification approach the error term's role as the non-systematic component of the statistical GM is retained, and the dependence in the sample is modelled from first principles in terms of the observable random variables involved. Sections 22.2 and 22.3 consider various ways to proceed with a non-random sample and the misspecification testing of the independent sample assumption, respectively. In Section 22.4 the discussion of misspecification analysis in Chapters 20-22 is put into perspective.
22.1 Implications of a non-random sample
(1) Defining the concept of a non-random sample
It is no exaggeration to say that the sampling model assumption of independence is by far the most crucial assumption underlying the linear regression model. As shown below, when this assumption is invalid no estimation, testing or prediction result derived in Chapter 19 is valid in general. In order to understand the circumstances under which the independence assumption might be inappropriate in practice, it is instructive to return to the linear regression model and consider the reduction from D(Z₁, Z₂, …, Z_T; ψ) to D(y_t/X_t; θ), t = 1, 2, …, T, in order to understand the role of the assumption in the reduction process and its relation to the NIID assumption for {Z_t, t ∈ T}, Z_t ≡ (y_t, X_t′)′.
As argued in Chapter 19, the linear regression model could be based directly on the conditional distribution D(y_t/X_t; ψ₁), and no need to define D(Z_t; ψ) arises. This was not the approach adopted, for a very good reason. In practice it is much easier to judge the appropriateness of assumptions related to D(Z_t; ψ), on the basis of the observed data, than assumptions related to D(y_t/X_t; ψ₁). What is more, the nature of the latter distribution is largely determined by that of the former.
In Chapter 19 we assumed that {Z_t, t ∈ T} is a normal, independent and identically distributed (NIID) stochastic process (see Chapter 8). On the basis of this assumption we were able to build the linear regression model defined by assumptions [1]-[8]. In particular, the probability and sampling model assumptions can be viewed as consequences of {Z_t, t ∈ T} being NIID. The normality of D(y_t/X_t; ψ₁), the linearity of E(y_t/X_t = x_t) and the homoskedasticity of Var(y_t/X_t = x_t) stem directly from the normality of Z_t. The time invariance of θ = (β, σ²) and the independent sample assumption stem from the identically distributed and independent components of NIID. Note that in the present context we distinguish between homoskedasticity (freedom from the conditioning variables) and time invariance of the conditional variance.
In view of this reduction, the obvious way to make the independent sample assumption inappropriate is to assume that {Z_t, t ∈ T} is a (dependent) stochastic process. In particular (because we do not want to lose the convenience of normality) we assume that
Z_t ~ N(m(t), Σ(t, t)),  Cov(Z_t, Z_s) = Σ(t, s),  t, s ∈ T.   (22.2)
If we return to the money equation estimated in Chapter 19, a cursory look at the realisation of Z_t, t = 1, 2, …, T (see Fig. 17.1(a)-(d)), would convince us that the above assumption seems much more appropriate than the IID assumption for such data. The realisations of the process exhibit a very distinct time trend (the mean changes systematically over time), and for at least two of them the variance seems to change as well.
The question which naturally arises at this stage is to what extent the assumptions underlying the linear regression model will be affected by relaxing the IID assumption for {Z_t, t ∈ T}. One obvious change will come in the form of a non-independent sample y = (y₁, y₂, …, y_T)′. What is not so obvious is the distribution which will replace D(y_t/X_t; ψ₁). Table 22.1 summarises the important steps in the reduction of the linear regression model from the initial assumption that {Z_t, t ∈ T} is a NIID stochastic process and contrasts these with the case where {Z_t, t ∈ T} is a non-IID process. The only minor difference between the 'construction' of the linear regression model as summarised in this table and that of Chapter 19 is that the mean of Z_t is intentionally given a non-zero value, because the mean plays an important role in the case of a non-IID stochastic process.
As we can see from Table 22.1, the first important difference between the IID and non-IID cases is that the distribution of Z₁, …, Z_T is complicated considerably in the latter case by the presence of the temporal covariances and the fact that all the parameters change with t. The presence of the temporal covariances implies that the decomposition of the joint distribution of Z₁, …, Z_T in terms of the marginal distributions is no longer valid. The dependence among the Z_ts implies that the only decomposition possible is the sequential conditioning decomposition (see Chapter 6), where the conditioning is relative to the past history of the process, denoted by Z⁰_{t−1} ≡ (Z₀, Z₁, …, Z_{t−1}), with t representing the 'present'. This in turn implies that the role played by D(Z_t; ψ) in the IID case will now be taken over by D(Z_t/Z⁰_{t−1}; ψ*(t)). In particular, the corresponding decomposition into the conditional (D(y_t/X_t; ψ₁)) and marginal (D(X_t; ψ₂)) distributions will now be
D(Z_t/Z⁰_{t−1}; ψ(t)) = D(y_t/Z⁰_{t−1}, X_t; ψ*₁(t)) · D(X_t/Z⁰_{t−1}; ψ*₂(t)),   (22.3)
where the past history of the process comes into both distributions because of the temporal dependence among the Z_ts.
Table 22.1

(I) {Z_t, t ∈ T}: (1) Normal, (2) Independent, (3) Identically distributed.

Distribution of the sample:
(Z₁′, Z₂′, …, Z_T′)′ ~ N( (m′, m′, …, m′)′, diag(Σ, Σ, …, Σ) ).

Decomposition:
D(Z₁, Z₂, …, Z_T; ψ) = ∏_{t=1}^{T} D(Z_t; ψ) = ∏_{t=1}^{T} D(y_t/X_t; ψ₁) D(X_t; ψ₂).

Probability model:
(i) (y_t/X_t = x_t) ~ N(c₀ + β′x_t, σ²), with c₀ = m₁ − σ₂₁′Σ₂₂⁻¹m₂, σ² = σ₁₁ − σ₂₁′Σ₂₂⁻¹σ₂₁, β = Σ₂₂⁻¹σ₂₁;
(ii) E(y_t/X_t = x_t) — linear in x_t;
(iii) Var(y_t/X_t = x_t) — homoskedastic;
(iv) θ = (c₀, β, σ²) — time invariant.

Sampling model:
y = (y₁, …, y_T)′ is an independent sample sequentially drawn from D(y_t/X_t; ψ₁), t = 1, 2, …, T, respectively.

(II) {Z_t, t ∈ T}: (1)′ Normal, (2)′ Dependent, (3)′ Non-identically distributed.

Distribution of the sample:
(Z₁′, Z₂′, …, Z_T′)′ ~ N( (m(1)′, m(2)′, …, m(T)′)′, [Σ(t, s)]_{t,s = 1, …, T} ).

Decomposition:
D(Z₁, Z₂, …, Z_T; ψ) = ∏_{t=1}^{T} D(Z_t/Z⁰_{t−1}; ψ*(t)) = ∏_{t=1}^{T} D(y_t/Z⁰_{t−1}, X_t; ψ*₁(t)) D(X_t/Z⁰_{t−1}; ψ*₂(t)).

Probability model:
(i) (y_t/Z⁰_{t−1}, X_t = x_t) ~ N( c₀(t) + β₀(t)′x_t + Σ_{i=1}^{t−1} [α_i(t) y_{t−i} + β_i(t)′x_{t−i}], σ₀²(t) ) (for the definition of the parameters involved see Appendix 22.1);
(ii) E(y_t/σ(Y⁰_{t−1}), X⁰_t = x⁰_t) — linear in x_t and Z⁰_{t−1};
(iii) Var(y_t/σ(Y⁰_{t−1}), X⁰_t = x⁰_t) — homoskedastic (free of Z⁰_{t−1});
(iv) θ*(t) = (c₀(t), β₀(t), β_i(t), α_i(t), i = 1, 2, …, t−1, σ₀²(t)) — time dependent.

Sampling model:
y = (y₁, …, y_T)′ is a non-random sample sequentially drawn from D(y_t/Z⁰_{t−1}, X_t; ψ*₁(t)), t = 1, 2, …, T, respectively.
Although the reduction in the non-IID case is rather involved, the underlying argument is the same as in the IID case. The role of D(y_t/X_t; ψ₁) is taken over by D(y_t/Z⁰_{t−1}, X_t; ψ*₁(t)), and the probability and sampling models in the non-IID case need to be defined in terms of the latter conditional distribution. A closer look at this distribution, however, reveals the following:

(i) E(y_t/σ(Y⁰_{t−1}), X⁰_t = x⁰_t) = c₀(t) + β₀(t)′x_t + Σ_{i=1}^{t−1} [α_i(t) y_{t−i} + β_i(t)′x_{t−i}];   (22.4)

(ii) Var(y_t/σ(Y⁰_{t−1}), X⁰_t = x⁰_t) = σ₀²(t);   (22.5)

(iii) θ*_t = (c₀(t), β₀(t), β_i(t), α_i(t), i = 1, 2, …, t−1, σ₀²(t)).   (22.6)
Firstly, although the conditional mean is linear in the conditioning variables, the number of variables increases with t. Secondly, even though the conditional variance is homoskedastic (free of the conditioning variables), it is time dependent, along with the other parameters of the distribution. This renders the conditional distribution D(y_t/Z⁰_{t−1}, X_t; ψ*₁(t)), as defined above, non-operational as a basis of an alternative specification to be contrasted with the linear regression model. It is in a sense much too general to be operational in practice. Theoretically, however, we can define the new sampling model as:
y = (y₁, …, y_T)′ represents a non-random sample sequentially drawn from D(y_t/Z⁰_{t−1}, X_t; ψ*₁(t)), t = 1, 2, …, T, respectively.
In view of this it is clear that what we need to do is to restrict the generality of the underlying assumptions related to {Z_t, t ∈ T} in order to render D(y_t/Z⁰_{t−1}, X_t; ψ*₁(t)) operational. In particular, we need to impose certain
restrictions on the form of the time-heterogeneity and dependence of the stochastic process {Z_t, t ∈ T}. This is the purpose of the next two sub-sections. In the next sub-section the dependence of {Z_t, t ∈ T} is restricted to asymptotic independence, and the complete time-heterogeneity (non-identically distributed) is restricted to stationarity (see Chapter 8), in order to define an operational statistical model.

(2) The respecification approach
As seen above, when {Z_t, t ∈ T} is assumed to be normal and non-IID, the conditional distribution we are interested in, D(y_t/Z⁰_{t−1}, X_t; ψ*₁(t)), has a mean whose number of terms increases with t, and its parameters ψ*₁(t) are time dependent (see (4)-(6)). The question which naturally arises at this stage is to what extent we need to restrict the non-IID assumption in order to 'solve' the incidental parameters problem. In order to understand the role of the dependence and non-identically-distributed assumptions, let us consider restricting the latter first.
Let us restrict the non-identically-distributed assumption to that of stationarity, i.e. assume that {Z_t, t ∈ T} is a normal, stationary stochastic process (without any restrictions on the form of dependence). Stationarity (see Section 8.2) implies that

E(Z_t) = m,  Cov(Z_t, Z_s) = Σ(|t − s|),  t, s ∈ T,   (22.7)

i.e.

(Z₁′, Z₂′, …, Z_T′)′ ~ N( (m′, m′, …, m′)′, V_Z ),   (22.8)

where V_Z is the block Toeplitz matrix whose (t, s) block is Σ_i ≡ Σ(|t − s|) for |t − s| = i, i = 0, 1, …, T − 1. That is, the Z_ts have identical means and variances, and their temporal covariances depend only on the absolute value of the distance between them. This reduces the above [T(k+1)] × [T(k+1)] covariance matrix of the sample to a block Toeplitz matrix (see Akaike (1974)). This restricts the original covariance matrix considerably by inducing symmetry and reducing the number of 'different' (k+1) × (k+1) matrices making up these covariances from T² to T − 1. A closer look at stationarity reveals that it is a direct extension of the identically-distributed assumption to the case of a dependent sequence of random variables. In terms of observed data, the sample realisation of a stationary process exhibits no systematic changes either in mean or in variance, and any t-period section of the realisation should look like any other t-period section. That is, if we slide a t-period 'window' over the realisation along the time axis, the 'picture' should not differ systematically. Examples of such realisations are given in Figs 21.2 and 21.3.
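The 'window' idea lends itself to a quick numerical check. The following sketch (not from the text; all names and parameter values are illustrative) contrasts a stationary AR(1) realisation with a trending one by computing the mean and variance over successive windows of each series.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 500
u = rng.normal(0, 1, T)

# Stationary AR(1): y_t = 0.6*y_{t-1} + u_t
y_stat = np.zeros(T)
for t in range(1, T):
    y_stat[t] = 0.6 * y_stat[t - 1] + u[t]

# Non-stationary alternative: a linear trend plus noise
y_trend = 0.05 * np.arange(T) + u

def window_moments(y, width=100):
    """Mean and variance over successive non-overlapping windows."""
    return [(round(y[s:s + width].mean(), 2), round(y[s:s + width].var(), 2))
            for s in range(0, len(y) - width + 1, width)]

print("stationary:", window_moments(y_stat))   # moments roughly constant
print("trending  :", window_moments(y_trend))  # the mean drifts systematically
```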
Imposing stationarity on {Z_t, t ∈ T}, we can deduce that, as far as D(y_t/Z⁰_{t−1}, X_t; ψ*₁(t)) is concerned, (4)-(6) take the form:
(i) E(y_t/σ(Y⁰_{t−1}), X⁰_t = x⁰_t) = c₀ + β₀′x_t + Σ_{i=1}^{t−1} α_i y_{t−i} + Σ_{i=1}^{t−1} β_i′x_{t−i};   (22.9)

(ii) Var(y_t/σ(Y⁰_{t−1}), X⁰_t = x⁰_t) = σ₀²;   (22.10)

(iii) θ* = (c₀, β₀, β_i, α_i, i = 1, 2, …, t − 1, σ₀²).   (22.11)
As we can see, stationarity enables us to 'solve' the parameter time-dependence problem, but the incidental parameters problem remains largely unresolved because the number of terms (and unknown parameters) in the conditional mean (9) increases with t. Hence, the time-homogeneity introduced by stationarity is not sufficient for an operational model. We also need to restrict the form of dependence of {Z_t, t ∈ T}. In particular, we need to restrict the 'memory' of the process by imposing restrictions such as ergodicity, mixing or asymptotic independence. In the present case the most convenient memory restriction is that of asymptotic independence. This restricts the conditional memory of the normal process so as to enable us to approximate the conditional mean of the process using an mth-order Markov process (see Chapter 8). A stochastic vector process {Z_t, t ∈ T} is said to be mth-order Markov if

E(Z_t/Z⁰_{t−1}) = E(Z_t/σ(Z_{t−1}, Z_{t−2}, …, Z_{t−m})),  t > m.   (22.12)

Assuming that {Z_t, t ∈ T} is:
(i) normal;
(ii) stationary; and
(iii) asymptotically independent
enables us to deduce that for large enough m (hopefully m < T) the conditional mean takes the form

μ_t = E(y_t/σ(Y⁰_{t−1}), X⁰_t = x⁰_t) = c₀ + β₀′x_t + Σ_{i=1}^{m} α_i y_{t−i} + Σ_{i=1}^{m} β_i′x_{t−i}.   (22.13)

This provides us with an operational form for the systematic component for t > m. Now θ* = (c₀, β₀, β_i, α_i, i = 1, 2, …, m, σ₀²) is both time invariant and of fixed dimensionality as T increases. Indeed, D(y_t/Z⁰_{t−1}, X_t; ψ*₁) yields a mean linear in x_t, y_{t−i}, x_{t−i}, i = 1, 2, …, m, a homoskedastic variance, and ψ*₁ is time invariant. Hence, defining the non-systematic component by
u_t = y_t − E(y_t/σ(Y⁰_{t−1}), X⁰_t = x⁰_t),   (22.14)

we can define a new statistical GM based on D(y_t/Z⁰_{t−1}, X_t; ψ*₁) to be

y_t = β₀′x_t + Σ_{i=1}^{m} (α_i y_{t−i} + β_i′x_{t−i}) + u_t,  t > m.   (22.15)
In matrix form, for the sample period t = m+1, …, T, it can be expressed as

y = Xβ₀ + Z*γ + u*   (22.16)

in an obvious notation. Note that c₀ has been dropped for notational convenience (implicitly included in β₀).
It must be noted at this stage that the assumptions of stationarity and asymptotic independence for {Z_t, t ∈ T} are not the least restrictive assumptions for the results which follow. For example, asymptotic independence can be weakened to ergodicity (see Section 8.3) without affecting any of the asymptotic results which follow. Moreover, by 'strengthening' the memory restriction to that of φ-mixing, some time-heterogeneity might be allowed without affecting the asymptotic results (see White (1984)).
It is also important to note that the maximum lag m in (15) does not represent the maximum memory lag of the process {y_t/Z⁰_{t−1}, X_t, t ∈ T}, as in the case of m-dependence (see Section 8.3). Although there is a duality result relating an m-dependent with an mth-order Markov process, in the case of the latter the memory is considerably longer than m (see Chapter 8). This is one of the reasons why the AR representation is preferred in the present context. The maximum memory lag is determined by the solution of the lag polynomial:

α(z) = 1 − Σ_{i=1}^{m} α_i z^i = 0.   (22.17)
In view of the statistical GM (15), let us consider the implications of the non-random sample (i)-(iii) for the estimation, testing and prediction results in Chapter 19. Starting with estimation, we can deduce that for

β̂ = (X′X)⁻¹X′y   (22.18)

and

s² = [1/(T − k)] û′û   (22.19)

the following results hold:
(i) E(β̂) = β₀ + (X′X)⁻¹X′E(Z*γ) ≠ β₀: β̂ is a biased estimator of β₀;

(ii) plim β̂ ≠ β₀: β̂ is an inconsistent estimator of β₀;

(iii) MSE(β̂) = σ²(X′X)⁻¹ + (X′X)⁻¹X′E(Z*γγ′Z*′)X(X′X)⁻¹ ≠ σ²(X′X)⁻¹;

(iv) E(s²) = σ² + [1/(T − k)]E(γ′Z*′M_xZ*γ) ≠ σ²: s² is a biased estimator of σ²;

(v) plim s² ≠ σ²: s² is an inconsistent estimator of σ²;

(vi) s²(X′X)⁻¹ is an inconsistent estimator of MSE(β̂),
where M_x = I_T − X(X′X)⁻¹X′. These results show clearly that the implications of the non-random sample assumption are very serious for the appropriateness of θ̂ = (β̂, s²) as estimators of θ = (β, σ²). Moreover, (i)-(vi) taken together imply that none of the testing or prediction results derived in Chapter 19 under the assumption of an independent sample are valid. In particular, the t-statistics for the significance of the coefficients in the estimated money equation are invalid, together with the tests for the coefficients of y_t and p_t being equal to one, as well as the prediction intervals. At first sight the argument underlying the derivation of (i)-(vi) seems to be identical to the 'omitted variables problem' criticised in Section 20.2 as being rather uninteresting in that context. A direct comparison, however, between (15) and

y_t = β′x_t + u_t,  t ∈ T,   (22.20)

reveals that both statistical GMs are special cases of the general statistical GM

y_t = E(y_t/σ(Y⁰_{t−1}), X⁰_t = x⁰_t) + ε_t   (22.21)

under alternative assumptions on {Z_t, t ∈ T}. In this sense (20) and (15) constitute 'reductions' from the same joint distribution

D(Z₁, …, Z_T; ψ),   (22.22)

which makes them directly comparable: they are based on the same conditioning set

𝒟_t = {σ(Y⁰_{t−1}), X⁰_t = x⁰_t},  t ∈ T.   (22.23)
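The flavour of results (i)-(vi) can be conveyed by a small simulation. The sketch below (not from the text; parameter values are illustrative) generates data from a dynamic GM of the form (15) with a single regressor but fits the static regression (20) by OLS; the estimator centres well away from β₀, illustrating the bias in (i).

```python
import numpy as np

rng = np.random.default_rng(1)
b0, a1, b1 = 1.0, 0.7, -0.4      # illustrative parameters of the dynamic GM
T, reps = 200, 2000
betas = np.empty(reps)

for r in range(reps):
    x = np.zeros(T)
    y = np.zeros(T)
    for t in range(1, T):
        x[t] = 0.8 * x[t - 1] + rng.normal()   # temporally dependent regressor
        y[t] = b0 * x[t] + a1 * y[t - 1] + b1 * x[t - 1] + rng.normal()
    betas[r] = (x @ y) / (x @ x)               # static OLS, single regressor

print("mean of OLS beta-hat:", betas.mean())   # differs markedly from b0 = 1.0
```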
(3) The autocorrelation approach
The main difference between the respecification approach considered above and the autocorrelation approach lies with the systematic component. In the respecification approach we need to respecify the systematic component in order to take account of (model) the temporal systematic information in the sample. In the autocorrelation approach the systematic component remains the same, and hence the temporal systematic information is relegated to the error term, which will no longer be non-systematic; the dependence is then modelled via the error terms. This is contrary to the logic of the approach to statistical model specification propounded in the present book (see Chapter 17). The approach, however, is important for various reasons. Firstly, the comparison with the respecification approach is very illuminating for both approaches. Secondly, the autocorrelation approach dominates the textbook econometric literature, and as a consequence it provides the basis for most misspecification tests of the independent sample assumption.
The systematic component for the autocorrelation approach is exactly the same as the one under the independent sample assumption. That is, assuming a certain form of temporal dependence (see (41)),

μ_t = E(y_t/𝒟_t) = β′x_t.   (22.24)

This implies that the temporal dependence in the sample will be left in the error term:

ε_t = y_t − E(y_t/𝒟_t),   (22.25)

𝒟_t as defined in (23). In view of this the error term will satisfy the following properties:

(i) E(ε_t/𝒟_t) = 0;   (22.26)

(ii) E(ε_t ε_s/𝒟_t) = σ²(t) for t = s, and v(t, s) for t ≠ s.   (22.27)
These assumptions in terms of the observable random variables are often expressed in the rather misleading notation:

(y/X) ~ N(Xβ, σ²V_T),  V_T > 0.   (22.28)
The question which naturally arises at this stage is, 'what are the implications of this formulation for the results related to the linear regression model derived in Chapter 19 under the independence assumption?' As far as the estimation results related to β̂ = (X′X)⁻¹X′y and s² = [1/(T − k)] û′û, û = y − Xβ̂, are concerned, we can show that:

(i)′ E(β̂) = β: β̂ is an unbiased estimator of β;
(ii)′ β̂ is a consistent estimator of β (under mild conditions on X and V_T);
(iii)′ Cov(β̂) = (X′X)⁻¹X′(σ²V_T)X(X′X)⁻¹ ≠ σ²(X′X)⁻¹;
(iv)′ E(s²) ≠ σ²: s² is a biased estimator of σ²;
(v)′ s² is an inconsistent estimator of σ²;
(vi)′ s²(X′X)⁻¹ is an inconsistent estimator of Cov(β̂).

In view of (iii)′, (v)′ and (vi)′ we can conclude that the testing results derived in Chapter 19 are also invalid.
The important difference between the results (i)-(vi) and (i)′-(vi)′ is that β̂ is not such a 'bad' estimator in the latter case. This is not surprising, however, given that we retained the systematic component of the linear regression model. On the other hand, the results based on Cov(y/X = X) = σ²I_T are inappropriate. The only undesirable property of β̂ in the present context is said to be its inefficiency relative to the proper MLE of β when V_T is assumed known. That is, β̂ is said to be an inefficient estimator relative to the GLS estimator

β̃ = (X′V_T⁻¹X)⁻¹X′V_T⁻¹y   (22.29)

(see Judge et al. (1985)). A very similar situation was encountered in the case of heteroskedasticity, and the same comment applies here as well. This efficiency comparison is largely irrelevant. In order to be able to make justifiable efficiency comparisons we should be able to compare (β̂, s²) with estimators based on the same information set. It is well known, however, that in the case where V_T is unknown no consistent estimator of the parameters of interest exists, and the information matrix cannot be used to define a full efficiency lower bound.
22.2 Tackling temporal dependence
The question 'How do we proceed when the independent sample assumption is invalid?' will be considered before the testing of this assumption, because in the autocorrelation approach the two are inextricably bound up, and the testing becomes easier to understand when the above question is considered first. This is because most misspecification tests of sample independence in the autocorrelation approach consider particular forms of departure from the independence assumption, which we discuss in the present section. This approach, however, will be considered after the respecification approach because, as mentioned above, the former is a special case of the latter. Moreover, the respecification approach provides a most illuminating framework in the context of which the autocorrelation approach can be thoroughly discussed.
(1) The respecification approach
In Section 22.1 we considered the question of respecifying the components of the linear regression model in view of the dependence in the sampling model. It was argued that in the case where {Z_t, t ∈ T} is assumed to be a normal, stationary and asymptotically independent process, the systematic component takes the form

μ_t = E(y_t/σ(Y⁰_{t−1}), X⁰_t = x⁰_t) = β₀′x_t + Σ_{i=1}^{m} (α_i y_{t−i} + β_i′x_{t−i}),  t > m.   (22.30)
The non-systematic component is defined by

u_t = y_t − E(y_t/σ(Y⁰_{t−1}), X⁰_t = x⁰_t),  t > m.   (22.31)
This suggests that {u_t, t > m} defines a martingale difference process relative to the sequence ℱ_t = σ(Z⁰_t, X_{t+1}), t > m, of σ-fields (see Chapter 8). That is,

E(u_t/ℱ_{t−1}) = 0,  t > m.   (22.32)

Moreover,

E(u_t u_s) = σ₀² for t = s, and 0 for t > s;   (22.33)

i.e. it is an innovation process. These properties will play a very important role in the statistical analysis of the implied statistical GM:
y_t = β₀′x_t + Σ_{i=1}^{m} (α_i y_{t−i} + β_i′x_{t−i}) + u_t,  t > m.   (22.34)
If we compare (34) with the statistical GM of the linear regression model, we can see that the error term, say ε_t, in

y_t = β′x_t + ε_t,  t ∈ T,   (22.35)

is no longer white noise relative to the information set {σ(Y⁰_{t−1}), X⁰_t = x⁰_t}, given that ε_t is largely predictable from this information set (see Granger (1980)). Moreover, in view of the behaviour of ε_t, the recursive estimator

β̂_t = β̂_{t−1} + (X⁰_t′X⁰_t)⁻¹x_t(y_t − β̂′_{t−1}x_t),  t > k,   (22.36)

(see Chapter 21) might exhibit parameter time dependence, given that (y_t − β̂′_{t−1}x_t), t > k, is no longer a mean innovation process but varies systematically with t. Hence, detecting parameter dependence using the recursive estimator β̂_t, t > k, should be interpreted with caution, because invalid conditioning might be the cause of the erratic behaviour of

E(β̂_t − β̂_{t−1}),  t > k.   (22.37)
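A small sketch (not from the text) of the recursive update (36), checking that it reproduces the full-sample OLS estimate; for clarity, (X⁰_t′X⁰_t)⁻¹ is recomputed at each step rather than updated recursively.

```python
import numpy as np

rng = np.random.default_rng(7)
T, k = 100, 2
X = np.column_stack([np.ones(T), rng.normal(size=T)])
y = X @ np.array([1.0, 0.5]) + rng.normal(size=T)

# Initialise from the first k observations, then apply (36) for t > k
beta = np.linalg.solve(X[:k].T @ X[:k], X[:k].T @ y[:k])
for t in range(k, T):
    Xt = X[:t + 1]
    beta = beta + np.linalg.solve(Xt.T @ Xt, X[t]) * (y[t] - X[t] @ beta)

print(beta)                                  # recursion ends at full-sample OLS
print(np.linalg.solve(X.T @ X, X.T @ y))
```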
The probability distribution underlying (34) comes in the form of D(y_t/Z⁰_{t−1}, X_t; ψ₁), which is related to the original sequential decomposition

D(Z₁, …, Z_T; ψ) = ∏_{t=1}^{T} D(Z_t/Z⁰_{t−1}; ψ*(t))   (22.38)

via

D(Z_t/Z⁰_{t−1}; ψ*(t)) = D(y_t/Z⁰_{t−1}, X_t; ψ₁) · D(X_t/Z⁰_{t−1}; ψ₂).   (22.39)
The parameters of interest θ = (β₀, β_i, α_i, i = 1, …, m, σ₀²) are functions of ψ₁, and X_t remains weakly exogenous with respect to θ because ψ₁ and ψ₂ are variation free (see Chapter 19). Hence, the estimation of θ can be based on D(y_t/Z⁰_{t−1}, X_t; ψ₁) only. For prediction purposes, however, the fact that D(X_t/Z⁰_{t−1}; ψ₂) involves Y⁰_{t−1} cannot be ignored, because of the feedback between these and X_t. Hence, for prediction purposes, in order to be able to concentrate exclusively on D(y_t/Z⁰_{t−1}, X_t; ψ₁), we need to assume that

D(X_t/Z⁰_{t−1}; ψ₂) = D(X_t/X⁰_{t−1}; ψ₂),   (22.40)

i.e. Y⁰_{t−1} does not Granger cause X_t (see Engle et al. (1983)). When weak exogeneity is supplemented with Granger non-causality we say that X_t is strongly exogenous with respect to θ.
The above changes to the linear regression model due to the non-random sample, taken together, amount to specifying a new statistical model which we call the dynamic linear regression model. Because of its importance in econometric modelling, the specification, estimation, testing and prediction in the context of the dynamic linear regression model will not be considered here but in a separate chapter (see Chapter 23).
(2) The autocorrelation approach
As argued above, in the case of the autocorrelation approach the stochastic process {Z_t, t ∈ T} is restricted even further than just stationary and asymptotically independent. In order to ensure (24) we need to restrict the temporal dependence among the components of Z_t to be 'largely' identical, i.e.

Cov(Z_it, Z_js) = σ_ij v(|t − s|),  i, j = 1, 2, …, k+1,  t, s ∈ T   (22.41)

(see Spanos (1985a) for further details).
The question of restricting the time-heterogeneity and memory of the process arises in the context of the autocorrelation approach as restrictions on σ²(t) and v(t, s) (see (27)). Indeed, looking at (28) we can see that σ²(t) has already been restricted (implicitly): σ²(t) = σ², t ∈ T.
Assuming that {ε_t, t ∈ T} is a stationary process, V_T becomes a Toeplitz matrix (see Durbin (1960)) of the form v_ts = v(|t − s|), t, s = 1, 2, …, T, and although the number of unknown parameters is reduced, the incidental parameters problem is not entirely solved unless some restrictions on the 'memory' of the process, such as ergodicity or mixing, are imposed. Hence, the same sort of restrictions as in the respecification approach are needed. In practice these are imposed by postulating a generating mechanism for ε_t which ensures both stationarity as well as some form of asymptotic independence (see Chapter 8). This mechanism (or model) is postulated to complement the statistical GM of the linear regression model under the independence assumption, in order to take account of the non-independence. The most commonly used models for ε_t are the AR(m), MA(m) and ARMA(p, q) models discussed in Chapter 8. By far the most widely used autocorrelated error mechanism is the AR(1) model, where ε_t = ρε_{t−1} + u_t and u_t is a white-noise process. Taking this as a typical example, the statistical GM under the non-random sample assumption in the context of the autocorrelation approach becomes
y_t = β′x_t + ε_t,   (22.42)

ε_t = ρε_{t−1} + u_t,  |ρ| < 1,  t ∈ T.   (22.43)

The effect of postulating (43) in order to supplement (42) is to reduce the number of unknown parameters of V_T to just one, ρ, which is time invariant as well. The temporal covariance matrix V_T takes the form

V_T(ρ) = [1/(1 − ρ²)] ×
[ 1         ρ         ρ²   …  ρ^{T−1} ]
[ ρ         1         ρ    …  ρ^{T−2} ]
[ ρ²        ρ         1    …  ρ^{T−3} ]
[ ⋮                        ⋱  ⋮       ]
[ ρ^{T−1}   ρ^{T−2}   …    …  1       ]   (22.44)
(see Chapter 8). On the assumption that (43) represents an appropriate model for the dependence in the sample, we can proceed to estimate the parameters of interest θ = (β, ρ, σ²), where σ² = E(u_t²).
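As a small illustration (not from the text), V_T(ρ) in (44) is a scaled Toeplitz matrix and can be built directly with scipy; the function name is ours.

```python
import numpy as np
from scipy.linalg import toeplitz

def V_T(rho, T):
    """AR(1) temporal covariance matrix: (1/(1 - rho**2)) * rho**|t-s|."""
    return toeplitz(rho ** np.arange(T)) / (1.0 - rho ** 2)

print(V_T(0.8, 4))
```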
The likelihood function based on the joint distribution of the sample under (42)-(43) is

L(θ; y) = (2πσ²)^{−T/2} (det V_T(ρ))^{−1/2} exp{ −[1/(2σ²)] (y − Xβ)′V_T(ρ)⁻¹(y − Xβ) },   (22.45)

so that

log L(θ; y) = −(T/2) log(2πσ²) − ½ log det V_T(ρ) − [1/(2σ²)] (y − Xβ)′V_T(ρ)⁻¹(y − Xβ),   (22.46)

with first-order conditions

∂log L/∂β = (1/σ²) X′V_T(ρ)⁻¹e = 0,   (22.47)

∂log L/∂σ² = −T/(2σ²) + [1/(2σ⁴)] e′V_T(ρ)⁻¹e = 0,   (22.48)

∂log L/∂ρ = −ρ/(1 − ρ²) + (1/σ²) Σ_{t=2}^{T} (e_t − ρe_{t−1}) e_{t−1} = 0,   (22.49)

where e = y − Xβ. Unfortunately, the first-order conditions ∂log L/∂θ = 0 cannot be solved explicitly for θ̂, the MLE of θ, because they are not linear in θ. To derive θ̂ we need to use some numerical optimisation procedure (see Beach and MacKinnon (1978), Harvey (1981), inter alia). For a more extensive discussion of the estimation and testing in the context of the autocorrelation approach, see Judge et al. (1985).
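A minimal sketch of such a numerical procedure (not from the text, and not the Beach-MacKinnon algorithm itself): the negative of the log likelihood (46) is minimised with a general-purpose optimiser over θ = (β, ρ, σ²), on simulated data. Names and starting values are illustrative; production code would start from OLS estimates.

```python
import numpy as np
from scipy.linalg import toeplitz
from scipy.optimize import minimize

rng = np.random.default_rng(2)
T, beta_true, rho_true = 200, np.array([1.0, 0.5]), 0.7
X = np.column_stack([np.ones(T), rng.normal(size=T)])
eps = np.zeros(T)
for t in range(1, T):
    eps[t] = rho_true * eps[t - 1] + rng.normal()
y = X @ beta_true + eps

def neg_log_lik(params):
    beta, rho, s2 = params[:2], params[2], params[3]
    if abs(rho) >= 1 or s2 <= 0:
        return np.inf                                    # outside parameter space
    V = toeplitz(rho ** np.arange(T)) / (1 - rho ** 2)   # V_T(rho) as in (44)
    e = y - X @ beta
    _, logdet = np.linalg.slogdet(V)
    return 0.5 * (T * np.log(2 * np.pi * s2) + logdet
                  + (e @ np.linalg.solve(V, e)) / s2)

res = minimize(neg_log_lik, x0=[0.0, 0.0, 0.0, 1.0], method="Nelder-Mead",
               options={"maxiter": 5000})
print("MLE of (beta_1, beta_2, rho, sigma^2):", res.x)
```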
(3) The two approaches compared — the common factor restrictions

As mentioned in Section 22.1, the autocorrelation approach is a special case of the respecification approach. In order to show this, let us consider the example of the statistical GM in the autocorrelation approach when the error mechanism is postulated to be an AR(m) process:

ε_t = ρ₁ε_{t−1} + ρ₂ε_{t−2} + ⋯ + ρ_m ε_{t−m} + u_t.   (22.50)

When this is substituted into y_t = β′x_t + ε_t, the resulting hybrid statistical GM is

y_t = β′x_t + Σ_{i=1}^{m} ρ_i (y_{t−i} − β′x_{t−i}) + u_t.   (22.51)
If we compare this with the statistical GM (34) in the respecification approach, we can see that the two are closely related. Indeed, (34) is identical to (51) under

H₀: β_i = −α_i β₀,  i = 1, 2, …, m.   (22.52)

These are called common factor restrictions (see Sargan (1964), Hendry and Mizon (1978), Sargan (1980)). This implies that in this case the model suggested by the autocorrelation approach is a special case of (34) with the common factors imposed a priori. The natural question which arises at this stage is whether this is a general result or is only true for this particular example. In Spanos (1985a) it was shown that, in the case of a temporal covariance matrix V_T based on a stationary ergodic process, the hybrid statistical GM takes the
general form

y_t = β′x_t + Σ_{i=1}^{m} a_i (y_{t−i} − β′x_{t−i}) + u_t.   (22.53)
Hence, the autocorrelation approach can be viewed as a special case of the respecification approach with the common factor restrictions imposed a priori.
Having established this relationship between the two approaches, it is interesting to consider the question: 'Why do the common factor restrictions arise naturally in the autocorrelation approach?' The answer lies with the way the two approaches take account of the temporal dependence in the formulation of the non-random sample assumption.
In the respecification approach, based on the statistical model specification procedure proposed in Chapter 17, any systematic information in the statistical model should be incorporated directly into the definition of the systematic component. The error term has to be non-systematic relative to the information set of the statistical model and represents the 'unmodelled' part of y_t. Hence, the systematic component needs to be redefined when temporal systematic information is present in the sampling model. On the other hand, the autocorrelation approach attributes the dependence in the sample to the covariance of the error terms, retaining the same systematic component. The common factors arise because, by modelling the error term, the implicit conditioning is of the form

E(ε_t/σ(E⁰_{t−1})) = Σ_{i=1}^{m} a_i ε_{t−i},   (22.54)

where E⁰_{t−1} ≡ (ε_{t−1}, ε_{t−2}, …, ε₀). The implied statistical GM is

ε_t = E(ε_t/σ(E⁰_{t−1})) + u_t,   (22.55)

or

y_t − β′x_t = Σ_{i=1}^{m} a_i (y_{t−i} − β′x_{t−i}) + u_t,   (22.56)

which is identical to (53). Hence, the common factors are the result of 'modelling' the dependence in the sample in terms of the error term and not directly in terms of the observable random variables. In order to see when the common factor restrictions are likely to hold, consider the normal stationary process {Z_t, t ∈ T} based on the following sequential conditional distribution:
(Z_t/Z⁰_{t−1}) ~ N( Σ_{i=1}^{m} A(i) Z_{t−i},  Ω ),   (22.57)

where, partitioned conformably with Z_t = (y_t, X_t′)′,

A(i) = [ a₁₁(i)  a₁₂(i)′ ]        Ω = [ ω₁₁  ω₁₂′ ]
       [ a₂₁(i)  A₂₂(i)  ],           [ ω₂₁  Ω₂₂  ].
As shown in Appendix 22.1, the parameters of the statistical GM (34) are related to the above parameters via the following equations:

β₀ = Ω₂₂⁻¹ω₂₁,  β_i′ = a₁₂(i)′ − ω₁₂′Ω₂₂⁻¹A₂₂(i),   (22.58)

and

α_i = a₁₁(i) − ω₁₂′Ω₂₂⁻¹a₂₁(i),  for i = 1, 2, …, m.
Hence the common factor restrictions hold when

a₁₂(i) = a₂₁(i) = 0  and  A₂₂(i) = a₁₁(i) I_k  for all i = 1, …, m.   (22.59)

That is, the common factor restrictions hold when Granger non-causality holds among all the components of Z_t, and an identical form of temporal self-dependence exists for each of them (see Spanos (1985a)). These are very unrealistic restrictions to impose a priori. In principle the common factor restrictions can be tested indirectly by testing (59) in the context of the general AR(m) representation.
A direct test for these restrictions can be formulated as a specification test in the context of the respecification approach. In order to illustrate how the common factor restrictions can be tested, let us return to the money equation estimated in Chapter 19 and consider the case where m = 1. The statistical GM of the money equation for the respecification and autocorrelation approaches is

m_t = β₁ + β₂y_t + β₃p_t + β₄i_t + α₁m_{t−1} + α₂y_{t−1} + α₃p_{t−1} + α₄i_{t−1} + u_t   (22.60)

and

m_t = β₁* + β₂*y_t + β₃*p_t + β₄*i_t + ε_t,  ε_t = a₁ε_{t−1} + u_t,  |a₁| < 1,   (22.61)

respectively. If we rewrite (61) as

(1 − a₁L)m_t = (1 − a₁L)(β₁* + β₂*y_t + β₃*p_t + β₄*i_t) + u_t,   (22.62)

we can see that the two sides of (62) have the common factor (1 − a₁L), which can be eliminated by dividing both sides by the common factor; this will give rise to (61). The null hypothesis

H₀: α₁ = −α₂/β₂ = −α₃/β₃ = −α₄/β₄

is tested against

H₁: α₁ ≠ −α₂/β₂ or α₁ ≠ −α₃/β₃ or α₁ ≠ −α₄/β₄.
Although the Wald test procedure (see Chapter 16) is theoretically much more attractive, given that estimation under H₁ is considerably easier (see Mizon (1977), Sargan (1980) on Wald tests), in our example the likelihood ratio test is more convenient because most computer packages provide the log likelihood. The rejection region is based on the asymptotic likelihood ratio test statistic (see Chapter 16),

−2 log_e λ(y) = 2{log_e L(θ̂; y) − log_e L(θ̃; y)} ~ χ²(k − 1) asymptotically under H₀,   (22.63)

where θ̂ and θ̃ refer to the MLEs of θ under H₁ and H₀ respectively, and takes the form

C₁ = {y: −2 log_e λ(y) ≥ c_α},  α = ∫_{c_α}^{∞} dχ²(k − 1).   (22.64)
Estimation of (60) for the period 1963ii-1982iv yielded

m_t = −0.766 + 0.793m_{t−1} + 0.038y_t + 0.240y_{t−1} + 0.023p_t + 0.160p_{t−1} − 0.041i_t + 0.006i_{t−1} + û_t,
(standard errors: 0.582, 0.060, 0.169, 0.182, 0.208, 0.220, 0.012, 0.013)

R² = 0.999, R̄² = 0.999, s = 0.0181, log L = 209.25, T = 79.   (22.65)

Estimation of (61) for the same period yielded

m_t = 4.196 + 0.561y_t + 0.884p_t − 0.040i_t + ε̂_t,  ε̂_t = 0.819ε̂_{t−1} + û_t,
(standard errors: 1.53, 0.158, 0.037, 0.013; 0.064)

R² = 0.998, R̄² = 0.998, s = 0.0223, log L = 187.73, T = 79.   (22.66)
Hence, −2 log_e λ(y) = 2(209.25 − 187.73) = 43.04. Given that c_α = 7.815 for α = 0.05 and three degrees of freedom, H₀ is strongly rejected.
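The likelihood ratio arithmetic, using the log likelihoods reported in (65) and (66), can be checked directly:

```python
from scipy.stats import chi2

logL_H1, logL_H0 = 209.25, 187.73          # from (65) and (66)
LR = 2 * (logL_H1 - logL_H0)
print(LR)                                  # 43.04
print(chi2.ppf(0.95, df=3))                # 7.815: H0 strongly rejected
```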
The validity of this testing procedure, however, depends on the appropriateness of the statistical GM postulated for the general model as part of a well-defined estimated statistical model. The question of ensuring this is extensively discussed in the next chapter.
22.3 Testing the independent sample assumption

(1) The respecification approach
In Section 22.1 it was argued that the statistical results related to the linear regression model (see Chapter 19) under the independent sample assumption are invalidated by the non-independence of the sample. For this reason it is of paramount importance to be able to test for independence.
As argued in Section 22.2, the statistical GM takes different forms when the sampling model is independent or asymptotically independent, that is,

y_t = β′x_t + u_t,  t = 1, 2, …, T,   (22.67)

and

y_t = β₀′x_t + Σ_{i=1}^{m} (α_i y_{t−i} + β_i′x_{t−i}) + u_t,  t > m,   (22.68)

respectively. This implies that a test of independence can be constructed based on the significance of the parameters α = (α₁, …, α_m)′, β* = (β₁′, β₂′, …, β_m′)′, i.e.

H₀: α = 0 and β* = 0, against H₁: α ≠ 0 or β* ≠ 0.
In view of the linearity of the restrictions, an F-test type procedure (see Chapters 19 and 20) should provide us with a reasonable test. The discussion of testing linear hypotheses in Chapter 20 suggests the statistic:

τ(y) = [(RRSS − URSS)/URSS] · [(T − k(m+1))/(mk)].   (22.69)

mkτ(y) has an asymptotic chi-square distribution (χ²(mk)) under H₀. In small samples, however, it might be preferable to use the F-distribution approximation (F(mk, T − k(m+1))), even though it is only asymptotically justifiable. This is because the statistic τ*(y) = mkτ(y) increases with the number of regressors, a feature which is particularly problematical in the present case because of the choice of m. On the other hand, the test statistic (69) does not necessarily increase when m increases. In practice we use the test based on the rejection region C₁ = {y: τ(y) ≥ c_α}, where c_α is defined by

α = ∫_{c_α}^{∞} dF(mk, T − k(m+1)),   (22.70)

'as if' it were a finite sample test.
Let us consider this test for the money equation estimated in Chapter 19, with m = 4, using the estimation period 1964i-1982iv:

m_t = 2.763 + 0.705y_t + 0.862p_t − 0.053i_t + û_t,   (22.71)
(standard errors: 1.10, 0.112, 0.022, 0.014)

R² = 0.995, R̄² = 0.995, s = 0.04022, log L = 138.425, RSS = 0.1165, T = 76;

m_t = 0.706 + 0.589m_{t−1} − 0.018m_{t−2} − 0.046m_{t−3} + 0.214m_{t−4}
      (0.815) (0.132)        (0.152)        (0.166)        (0.129)
    + 0.191y_t + 0.518y_{t−1} − 0.253y_{t−2} − 0.116y_{t−3} − 0.022y_{t−4}
      (0.199)    (0.261)        (0.255)        (0.260)        (0.223)
    − 0.060p_t + 0.606p_{t−1} − 0.381p_{t−2} + 0.558p_{t−3} − 0.479p_{t−4}
      (0.348)    (0.670)        (0.642)        (0.630)        (0.434)
    − 0.047i_t + 0.017i_{t−1} − 0.025i_{t−2} + 0.006i_{t−3} − 0.018i_{t−4} + û_t,   (22.72)
      (0.014)    (0.020)        (0.022)        (0.021)        (0.014)

R² = 0.999, R̄² = 0.999, s = 0.01778, log L = 210.033, RSS = 0.017697, T = 76.
The test statistic (69) takes the value τ(y) = 19.25. Given that c_α = 1.812 for α = 0.05 and degrees of freedom (16, 56), we can deduce that H₀ is strongly rejected. That is, the independence assumption is invalid. This confirms our initial reaction to the time path of the residuals û_t = y_t − β̂′x_t (see Fig. 19.3), that some time dependency was apparently present.
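The statistic (69) can be recomputed from the residual sums of squares reported in (71) and (72):

```python
from scipy.stats import f

RRSS, URSS = 0.1165, 0.017697              # from (71) and (72)
mk, df2 = 16, 56                           # mk and T - k(m+1)
tau = ((RRSS - URSS) / URSS) * (df2 / mk)
print(tau)                 # about 19.5 (19.25 in the text, which uses unrounded RSS)
print(f.ppf(0.95, mk, df2))                # about 1.81: H0 strongly rejected
```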
An asymptotically equivalent test to the F-test considered above, which corresponds to the Lagrange multiplier test (see Chapters 16 and 20), can be based on the statistic

LM(y) = TR² ~ χ²(mk) asymptotically under H₀,   (22.73)

where the R² comes from the auxiliary regression

û_t = (β₀ − β)′x_t + Σ_{i=1}^{m} (α_i y_{t−i} + β_i′x_{t−i}) + v_t,  t = m+1, …, T.   (22.74)
In the case of the money equation the R² for this auxiliary regression is 0.848, which implies that LM(y) = 64.45. Given that c_α = 26.296 for α = 0.05 and 16 degrees of freedom, H₀ is again strongly rejected.
It is interesting to note that LM(y) can be expressed in the form:

LM(y) = T · (RRSS − URSS)/RRSS   (22.75)

(see exercise 4). This form of the test statistic suggests that the test suffers from the same problem as the one based on the statistic τ*(y), given that R² increases with the number of regressors (see Section 19.4).
(2) The autocorrelation approach
As argued in Section 22.2 above, tackling the non-independence of the sample in the context of the autocorrelation approach before testing the appropriateness of the implied common factors is not the correct strategy to adopt. In testing the common factors implied by adopting an error autocorrelation formulation, however, we need to refer back to the respecification approach. Hence, the question arises: 'how useful is a test of the independence assumption in the context of the autocorrelation approach, given that the test is based on an assumption which is likely to be erroneous?' In order to answer this question it is instructive to compare the statistical GMs of the two approaches:
y_t = β₀′x_t + Σ_{i=1}^{m} α_i y_{t−i} + Σ_{i=1}^{m} β_i′x_{t−i} + u_t,  t > m,   (22.76)

and

y_t = β′x_t + ε_t,  a(L)ε_t = b(L)u_t,   (22.77)

where a(L) and b(L) are pth- and qth-order polynomials in the lag operator L. That is, the postulated model for the error term is an ARMA(p, q) time series formulation (see Section 8.4).
The error term ε_t interpreted in the context of (76) takes the form

ε_t = (β₀ − β)′x_t + Σ_{i=1}^{m} [α_i y_{t−i} + β_i′x_{t−i}] + u_t,  t > m.   (22.78)

That is, ε_t is a linear function of the normal, stationary and asymptotically independent process {Z_t, t ∈ T}; hence ε_t as defined in (78) is itself a normal, stationary and asymptotically independent process. Such a process, however, can always be approximated to any degree of approximation by an ARMA(p, q) stationary process with 'large enough' p and q (see Hannan (1970), Rozanov (1967), inter alia). In view of this we can see that testing for departures from the independent sample assumption in the context of the autocorrelation approach is not unreasonable. What could be very misleading is to interpret the rejection of the independence assumption as an 'endorsement' of the error autocorrelation model postulated by the misspecification test. Such an interpretation is totally unwarranted.
An interesting feature of the comparison between (76) and (77) is the role of the coefficients of x_t: β₀ and β are equal only when the common factor restrictions are valid. In such a case estimation of β in y_t = β′x_t + ε_t should yield the same estimate as the estimate of β in

y_t = β′x_t + ε_t,  a(L)ε_t = b(L)u_t.   (22.79)
Hence, a crude 'test' of the appropriateness of the implicitly imposed common factor restrictions might be to compare the two estimates of β in the context of the autocorrelation approach. Such a comparison might be quite useful in cases where one is reading somebody else's published work and there is no possibility of testing the common factor restrictions directly. In view of the above discussion we can conclude that tests of the sample independence assumption in the context of the autocorrelation approach do have a role to play in misspecification testing related to the linear regression model, in so far as they indicate that the residuals ε̂_t, t = 1, …, T, do not constitute a realisation of a white-noise process. For this reason we are going to consider some of these tests in the light of the above discussion. In particular, the emphasis will be placed on the non-parametric aspects of these tests. That is, the particular form of the error autocorrelation (AR(1), MA(1), ARMA(p, q)) will be less crucial in the discussion. In a certain sense these autocorrelation-based tests will be viewed in the context of the auxiliary regression

û_t = δ′x_t + Σ_{i=1}^{m} ρ_i û_{t−i} + v_t,  t > m,   (22.80)

which constitutes a special case of (74).
The natural way to proceed in order to construct tests for such a hypothesis is to choose a 'measure' of temporal association among the ε_ts and devise a procedure to determine whether the estimated temporal associations are significant or not.
The most obvious measure of temporal dependence is the correlation between ε_t and ε_{t+l}, what we call lth-order autocorrelation, defined by

r_l = Cov(ε_t, ε_{t+l}) / [Var(ε_t) Var(ε_{t+l})]^{1/2}.   (22.81)

In the case where {ε_t, t ∈ T} is also assumed to be stationary (Var(ε_t) = Var(ε_{t+l})) this becomes

r_l = Cov(ε_t, ε_{t+l}) / Var(ε_t),  l ≥ 1.   (22.82)

Note that −1 ≤ r_l ≤ 1. The natural estimator of r_l in the present context is

r̂_l = ( Σ_{t=l+1}^{T} ε̂_t ε̂_{t−l} ) / ( Σ_{t=1}^{T} ε̂_t² ),  l ≥ 1.   (22.83)
Intuition suggests that a test for the sample independence assumption in the present context should consider whether the values of r̂_l, for some l = 1, 2, …, m, say, are significantly different from zero. In the next subsection we consider tests based on l = 1 and then generalise the results to l = m > 1.
The Durbin-Watson test (l = 1)

The most widely used (and misused) misspecification test for the independence assumption in the context of the autocorrelation approach is the so-called Durbin-Watson test. The postulated statistical GM for the purposes of this test is

y_t = β′x_t + ε_t,  ε_t = ρε_{t−1} + u_t,  |ρ| < 1,  t ∈ T.   (22.84)

The null hypothesis of interest is H₀: ρ = 0 (i.e. ε_t = u_t, white noise) against H₁: ρ ≠ 0. The Durbin-Watson test statistic is

τ₁(y) = [ Σ_{t=2}^{T} (û_t − û_{t−1})² ] / [ Σ_{t=1}^{T} û_t² ] = û′A₁û / û′û,   (22.85)

where

A₁ =
[  1  −1   0  …   0   0 ]
[ −1   2  −1  …   0   0 ]
[  0  −1   2  …   0   0 ]
[  ⋮            ⋱      ⋮ ]
[  0   0   0  …   2  −1 ]
[  0   0   0  …  −1   1 ]   (22.86)
The relationship between A₁ and V_T is given by

V_T(ρ)⁻¹ = (1 − ρ)² I_T + ρA₁ + ρ(1 − ρ)C,   (22.87)

where C = diag(1, 0, …, 0, 1). Durbin and Watson used the approximation

V_T(ρ)⁻¹ ≈ (1 − ρ)² I_T + ρA₁,   (22.88)

which enabled them to use Anderson's result based on a known temporal covariance matrix. As we can see from (88), when ρ = 0, V_T(ρ)⁻¹ = I_T. The rejection region for the null hypothesis
H₀: ρ = 0 against H₁: ρ > 0 takes the form

C₁ = {y: τ₁(y) ≤ c_α},   (22.89)
where c_α refers to the critical value for a size α test, determined by the distribution of the test statistic (85) under H₀. This distribution, however, is inextricably bound up with the observed data matrix X, given that û = M_xε, M_x = I_T − X(X′X)⁻¹X′ (see Chapter 19), and

τ₁(y) = ε′M_xA₁M_xε / ε′M_xε,   (22.90)

which can be expressed in the form

τ₁(y) = ( Σ_{i=1}^{T−k} ν_i ξ_i² ) / ( Σ_{i=1}^{T−k} ξ_i² ),   (22.91)

where ν₁, ν₂, …, ν_{T−k} are the non-zero eigenvalues of M_xA₁ and

ξ = H′ε ~ N(0, σ² I_{T−k}).   (22.92)
Hence the value c_α can be evaluated by 'solving' the following probabilistic statement, based on (91), for c_α:

Pr(τ₁(y) ≤ c_α) = Pr( Σ_{i=1}^{T−k} (ν_i − c_α) ξ_i² ≤ 0 ) = α,   (22.93)

for a given size α.
Hannan (1970), however, suggested that in practice there is no need to evaluate c_α. Instead we could evaluate

p(y) = Pr( Σ_{i=1}^{T−k} (ν_i − τ₁(y)) ξ_i² ≤ 0 ),   (22.94)

and if this probability exceeds α we accept H₀; otherwise we reject H₀ in favour of H₁: ρ > 0. For H₁: ρ < 0 as the alternative, 4 − τ₁(y) should be used in place of τ₁(y). The value 4 stems from the fact that
τ₁(y) ≈ 2(1 − r̂₁),  where r̂₁ = ( Σ_{t=2}^{T} û_t û_{t−1} ) / ( Σ_{t=1}^{T} û_t² )   (22.95)

is the first-order residual correlation coefficient and takes values between −1 and 1; hence 0 ≤ τ₁(y) ≤ 4.
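A small sketch (not from the text) of the statistic in (85) and its link (95) to the first-order residual correlation coefficient, using simulated AR(1) residuals:

```python
import numpy as np

def durbin_watson(u):
    """tau_1(y) = sum (u_t - u_{t-1})^2 / sum u_t^2, approximately 2*(1 - r1)."""
    return np.sum(np.diff(u) ** 2) / np.sum(u ** 2)

rng = np.random.default_rng(4)
u = rng.normal(size=200)
for t in range(1, 200):                    # impose AR(1) dependence, rho = 0.8
    u[t] += 0.8 * u[t - 1]

r1 = np.sum(u[1:] * u[:-1]) / np.sum(u ** 2)
print(durbin_watson(u), 2 * (1 - r1))      # both well below 2, and close together
```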
In the case of the estimated money equation discussed above, the Durbin-Watson test statistic takes the value

τ₁(y) = 0.376.   (22.96)

For this value p(y) = 0.000, and hence H₀: ρ = 0 is strongly rejected for any size α test. The question which arises is, 'how do we interpret the rejection of H₀ in this case?' 'Can we use it as an indication that the appropriate statistical GM is not y_t = β′x_t + u_t but (84)?' The answer is, certainly not. In no circumstances should we interpret the rejection of H₀ against some specific form of departure from independence as a confirmation of the validity of the alternative in the context of the autocorrelation approach. This is because in a misspecification testing framework rejection of H₀ should never be interpreted as equivalent to acceptance of H₁; see Davidson and MacKinnon (1985).
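Once the eigenvalues ν_i are in hand, Hannan's probability (94) is straightforward to approximate by simulation. The sketch below (not from the text) uses an illustrative X matrix, so the resulting p-value is not that of the money equation; it merely shows the mechanics.

```python
import numpy as np

rng = np.random.default_rng(3)
T, k = 50, 3
X = np.column_stack([np.ones(T), rng.normal(size=(T, k - 1))])

A1 = 2 * np.eye(T) - np.eye(T, k=1) - np.eye(T, k=-1)   # the matrix in (86)
A1[0, 0] = A1[-1, -1] = 1
Mx = np.eye(T) - X @ np.linalg.solve(X.T @ X, X.T)

# Non-zero eigenvalues of Mx A1: equal to those of the symmetric Mx A1 Mx
nu = np.sort(np.linalg.eigvalsh(Mx @ A1 @ Mx))[k:]

def p_value(tau1, draws=50_000):
    """Monte Carlo estimate of Pr( sum_i (nu_i - tau1) * xi_i^2 <= 0 )."""
    xi2 = rng.chisquare(1, size=(draws, len(nu)))
    return np.mean((xi2 * (nu - tau1)).sum(axis=1) <= 0)

print(p_value(0.376))    # typically near zero for a value as small as 0.376
```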
In the case where the evaluation of (93) is not possible, Durbin and Watson (1950, 1951), using the eigenvalues of A₁, proposed a bounds test: they derived d_L and d_U such that for any X,

d_L ≤ τ₁(y) ≤ d_U,   (22.97)

where d_L and d_U are independent of X. This led them to propose the bounds test for

H₀: ρ = 0 against H₁: ρ > 0   (22.98)

based on

C₁ = {y: τ₁(y) < d_L} and C₀ = {y: τ₁(y) > d_U}.   (22.99)

In the case where d_L ≤ τ₁(y) ≤ d_U the test is inconclusive (see Maddala (1977) for a detailed discussion of the inconclusive region). For the case H₀: ρ = 0 against H₁: ρ < 0, the test statistic 4 − τ₁(y) should be used in (99).
In view of the discussion at the beginning of this section, the Durbin-Watson (DW) test as a test of the independent sample assumption is useful in so far as it is based on the first-order autocorrelation coefficient r₁. Because of the relationship (95), it is reasonable to assume that the test will have adequate power against other forms of first-order dependence, such as MA(1) (see King (1983) for an excellent survey of the DW and related tests). Hence, in practice the test should be used not as a test related to an AR(1) error autocorrelation only, but as a general first-order dependence test. Moreover, the DW test is likely to have power against higher-order dependence in so far as the first-order autocorrelation coefficient 'captures' part of this temporal dependence.
Higher-order tests
Given that estimation of the linear regression model is easier to handle under the independent sample assumption than when supplemented with an autocorrelation error model, it should come as no surprise to discover that the Lagrange multiplier test procedure (see Chapter 16) provides the most convenient method for constructing asymptotic misspecification tests for sample independence.
In order to illustrate the derivation of misspecification tests for higher-order dependencies among the y_ts, let us consider the LM test procedure for the simple AR(1) case, which generalises directly to an AR(m) as well as a moving-average (MA(m)) process. Postulating an AR(1) form of dependency among the y_ts is equivalent to changing the statistical GM of the linear regression model to

y_t = β′x_t + ρ(y_{t−1} − β′x_{t−1}) + u_t,  t ∈ T,   (22.100)

as can be easily verified from (84). The null hypothesis is H₀: ρ = 0 against H₁: ρ ≠ 0. Because the estimation of (100) under H₀ is much easier than its estimation under H₁, the LM test procedure is computationally preferable to both the Wald and the likelihood ratio test procedures.
The efficient score form of the LM test statistic is

LM(y) = [∂log L(θ̃)/∂θ]′ I_T(θ̃)⁻¹ [∂log L(θ̃)/∂θ],   (22.101)

where θ̃ is the MLE of θ under H₀. In the case where H₀ involves only a subset of θ, say H₀: θ₁ = θ₁⁰ where θ = (θ₁, θ₂), the statistic takes the form

LM(y) = [∂log L(θ₁⁰, θ̃₂)/∂θ₁]′ (I₁₁ − I₁₂I₂₂⁻¹I₂₁)⁻¹ [∂log L(θ₁⁰, θ̃₂)/∂θ₁]   (22.102)

(see Chapter 16). In the present case, applying this to H₀: ρ = 0 in (100) yields the statistic

LM(y) = T r̂₁²,   (22.106)

with rejection region

C₁ = {y: LM(y) ≥ c_α},  where α = ∫_{c_α}^{∞} dχ²(1).   (22.107)
The form of the test statistic in (106) makes a lot of intuitive sense, given that the first-order residual correlation coefficient is the best measure of first-order dependence among the residuals. Thus it should come as no surprise to learn that for τth-order temporal dependence, say ε_t = ρε_{t−τ} + u_t, τ ≥ 1, the LM test statistic takes the form

LM(y) = T r̂_τ²,   (22.108)

where

r̂_τ = ( Σ_{t=τ+1}^{T} ε̂_t ε̂_{t−τ} ) / ( Σ_{t=1}^{T} ε̂_t² ),  τ = 1, 2, …, m < T   (22.109)

(see Godfrey (1978), Breusch and Pagan (1980)).
This result generalises directly to higher-order autoregressive and moving-average error models. Indeed, both forms of error model give rise to the same LM test statistic. That is, for the statistical GM y_t = β′x_t + ε_t, t = 1, 2, …, T, supplemented with either an AR(m) error process:

(i) ε_t = ρ₁ε_{t−1} + ⋯ + ρ_m ε_{t−m} + u_t,   (22.110)

or an MA(m) process:

(ii) ε_t = u_t + a₁u_{t−1} + ⋯ + a_m u_{t−m},   (22.111)

the LM test statistic for

H₀: ρ_i = 0 (a_i = 0) for all i = 1, 2, …, m, against
H₁: ρ_i ≠ 0 (a_i ≠ 0) for any i = 1, 2, …, m,

takes the same form

LM(y) = T ( Σ_{i=1}^{m} r̂_i² ) ~ χ²(m) asymptotically under H₀,   (22.112)

with rejection region

C₁ = {y: LM(y) ≥ c_α},  α = ∫_{c_α}^{∞} dχ²(m)   (22.113)

(see Godfrey (1978), Godfrey and Wickens (1982)). The intuition underlying this result is that, in terms of the LM test statistic, the two underlying error processes cannot be distinguished asymptotically.
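A minimal sketch (not from the text) of the statistic (112): T times the sum of the first m squared residual autocorrelations, referred to χ²(m). For white-noise residuals it should fall below the critical value most of the time.

```python
import numpy as np
from scipy.stats import chi2

def lm_autocorr(resid, m):
    """LM(y) = T * sum_{i=1}^{m} r_i^2, with r_i as in (83)."""
    T = len(resid)
    denom = np.sum(resid ** 2)
    r = np.array([np.sum(resid[i:] * resid[:-i]) / denom
                  for i in range(1, m + 1)])
    return T * np.sum(r ** 2)

rng = np.random.default_rng(5)
e = rng.normal(size=300)                   # white-noise 'residuals'
print(lm_autocorr(e, 4), chi2.ppf(0.95, 4))
```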
An asymptotically equivalent test, sometimes called the modified LM test, can be based on the R² of the auxiliary regression (see (80)):

ε̂_t = δ′x_t + γ₁ε̂_{t−1} + ⋯ + γ_m ε̂_{t−m} + u_t.   (22.114)

That is,

TR² ~ χ²(m) asymptotically under H₀,   (22.115)

with the rejection region

C₁ = {y: TR² ≥ c_α},  α = ∫_{c_α}^{∞} dχ²(m).   (22.116)

This is a test for the joint significance of γ₁, γ₂, …, γ_m, similar to (112) above. The auxiliary regression for the estimated money equation with m = 6 yielded:

TR² = 72(0.7149) = 51.47.   (22.117)

Given that c_α = 12.592 for α = 0.05, the null hypothesis H₀: γ = 0 is strongly rejected in favour of H₁: γ ≠ 0. Hence, the estimated money equation has failed every single misspecification test for independence, showing clearly that the independent sample assumption was grossly inappropriate. The above discussion also suggests that the departures from linearity, homoskedasticity and parameter time invariance detected in Chapter 21 might be related to the dependence in the sample as well.
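A minimal sketch (not from the text) of the modified LM test (114)-(116) using numpy's least-squares routine: regress the OLS residuals on x_t and m lagged residuals, and refer TR² (computed on the auxiliary sample) to χ²(m). All names are illustrative.

```python
import numpy as np
from scipy.stats import chi2

def modified_lm(y, X, m):
    u = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]          # OLS residuals
    Z = np.column_stack([X[m:]] +
                        [u[m - j:len(u) - j] for j in range(1, m + 1)])
    v = u[m:] - Z @ np.linalg.lstsq(Z, u[m:], rcond=None)[0]  # auxiliary residuals
    dev = u[m:] - u[m:].mean()
    R2 = 1 - (v @ v) / (dev @ dev)
    return len(u[m:]) * R2

rng = np.random.default_rng(6)
T = 300
X = np.column_stack([np.ones(T), rng.normal(size=T)])
y = X @ np.array([1.0, 0.5]) + rng.normal(size=T)             # independent sample
print(modified_lm(y, X, m=4), chi2.ppf(0.95, 4))              # usually below 9.488
```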
22.4 Looking back
In Chapters 20-22 we discussed the implications of certain departures from the assumptions underlying the linear regression model, how we can test for these departures, as well as how we should proceed when these assumptions are invalid. In relation to the implications of these departures, we have seen that the statistical results (estimation, testing and prediction) related to the linear regression model (see Chapter 19) are seriously affected, with some departures, such as a non-random sample, invalidating these results altogether. This makes the testing for these departures particularly crucial for econometric modelling. This is because the first stage in the statistical analysis of an econometric model is the specification and estimation of a well-defined (no misspecifications) statistical model. Unless we start with a well-defined estimated statistical GM, any deductions, such as inferences about the estimated parameters, prediction and policy analysis, will be misleading, if not outright unwarranted, given that these conclusions will be based on erroneous statistical foundations.
When some misspecification is detected, the general way to proceed is to respecify the statistical model in view of the departures from the underlying assumption(s). This sometimes involves the reconsideration of all the assumptions underlying the statistical model, as in the case of the independent sample assumption.
One important disadvantage of the misspecification testing discussed in Chapters 20-22 is the fact that most of the departures were considered in isolation. A more appropriate procedure would be to derive joint misspecification tests. For such tests the auxiliary regressions test procedure discussed in Chapter 21 seems to provide the most practical way forward. In particular, it is intuitively obvious that the best way to generate a 'good' misspecification test is to turn it into a specification test. That is, extend the linear regression model in the directions of possible departures from the underlying assumptions in a way which defines a new statistical model which contains the linear regression model as a special case, under the null that the underlying assumptions of the latter are valid (see Spanos (1985c)).
Once we have ensured that the underlying assumptions [1]-[8] of the linear regression model are valid for the data series chosen, we call the estimated statistical GM with the underlying assumptions a well-defined estimated statistical model. This provides the starting point for reparametrisation/restriction and model selection in an attempt to construct an empirical econometric model (see Fig. 1.2). Using statistical procedures based on a well-defined estimated statistical model, we can proceed to reparametrise the statistical GM in terms of the theoretical parameters of interest ξ, which are directly related to the statistical parameters θ via a system of equations of the form
G(θ, ξ) = 0.   (22.118)

The reparametrisation in terms of ξ is possible only when the system of equations provides a unique solution for ξ in terms of θ:

ξ = H(θ)   (22.119)

(explicitly or implicitly). In such a case ξ is said to be identified. As we can see, the theoretical parameters of interest derive their statistical meaning from their relationship to θ. In the case where there are fewer ξ_is than θ_is, the extra restrictions implied by (118) can (and should) be tested before being imposed.
The same well-defined estimated statistical GM can give rise to a number of possible empirical econometric models. This raises the question of model selection, where various statistical as well as theory-oriented criteria can be used, such as theory consistency, parsimony, encompassing, robustness, and nearly orthogonal explanatory variables (see Hendry and Richard (1983)), in order to select the 'best' empirical econometric model. This reparametrisation/restriction of the estimated statistical GM, however, should not be achieved at the expense of its statistical properties. The empirical econometric model needs to be a well-defined estimated statistical model itself in order to be used for prediction and policy analysis.
Appendix 22.1

c₀(t) = m₁(t) − ω₁₂(t)′Ω₂₂(t)⁻¹m₂(t),
β₀(t) = Ω₂₂(t)⁻¹ω₂₁(t),
α_i(t) = a₁₁(i, t) − ω₁₂(t)′Ω₂₂(t)⁻¹a₂₁(i, t),
β_i(t) = a₁₂(i, t) − ω₁₂(t)′Ω₂₂(t)⁻¹A₂₂(i, t).

Important concepts
Error autocorrelation, stationary process, ergodicity, mixing, an innovation process, a martingale difference process, strong exogeneity, common factor restrictions, Durbin-Watson test, well-defined estimated statistical GM, reparametrisation/restriction, model selection.
Questions
1. Compare the interpretation of non-independence in the context of the autocorrelation approach with that of the respecification approach.
2. Compare and contrast the implications of the respecification and autocorrelation approaches for the properties of the estimators β̂ = (X′X)⁻¹X′y, s² = [1/(T − k)] û′û, û = y − Xβ̂, and the F-test for H₀: Rβ = r against H₁: Rβ ≠ r.
3. Explain the role of stationarity, in the context of the respecification approach, for the form of the statistical GM (34).
4. 'The concept of a stationary stochastic process constitutes a generalisation of the concept of identically distributed random variables to stochastic processes.' Discuss.
5. 'Inappropriate conditioning due to the non-independence of the sample leads to time dependency of the parameters of interest θ = (β, σ²).' Explain.
6. Compare the ways to tackle the problem of a non-random sample in the context of the respecification and autocorrelation approaches.
7. Explain why we do not need strong exogeneity for the estimation of the parameters of interest in (34).
8. Explain how the common factor restrictions arise in the context of the autocorrelation approach.
9. Discuss the merits of testing the independent sample assumption in the context of the respecification approach as compared with that of the autocorrelation approach.
10. 'Rejecting the null hypothesis in a misspecification test for error autocorrelation should not be interpreted as an endorsement of the postulated error model.' Discuss.
11. Explain the exact Durbin-Watson test.
12. 'The Durbin-Watson test as a test of the independence assumption in the context of the autocorrelation approach is useful in so far as it is directly related to r₁.' Discuss.
13. Discuss the Lagrange multiplier test for higher-order error autocorrelation (AR(m) and MA(m)) and explain intuitively why the test is identical for both models.
14. State briefly the changes in the assumptions underlying the linear regression model brought about by the non-independence of the sample.
15. Explain the role of an asymptotic independence restriction on the memory of {Z_t, t ∈ T} in defining the statistical GM for a non-random sample.

Exercises
1. Verify the implications of a non-random sample assumption for β̂ and s², as given by (i)-(vi) and (i)′-(vi)′ of Section 22.1, in the context of the respecification and autocorrelation approaches respectively.
2. Show that ∂log L(θ; y)/∂ρ = 0, where log L(θ; y) is given in equation (49), is non-linear in ρ.
3. Derive the Wald test for the common factor restrictions in the case where k = 2 and m = 1.
4. Verify the formula TR² = [(RRSS − URSS)/RRSS]T given in equation (75); see Engle (1984).
5. Derive the LM test statistic for H₀: ρ = 0 against H₁: ρ ≠ 0 in the case where ε_t = ρε_{t−1} + u_t, as given in equation (106).
Additional references