CHAPTER 18

The Gauss linear model

18.1 Specification
In the context of the Gauss linear model the only random variable involved is the variable whose behaviour is of interest. Denoting this random variable by y, we assume that the stochastic process {y_t, t ∈ T} is a normal, independent process with E(y_t) = μ_t and a time-homogeneous variance σ² for t ∈ T (T being some index set, not necessarily time):

    y_t \sim N(\mu_t, \sigma^2), \quad t \in T,    (18.1)

defined on the probability space (S, ℱ, P(·)).
In terms of the general statistical GM (17.15) the relevant conditioning set is the trivial σ-field 𝒟₀ = {S, ∅}, which implies that

    \mu_t = E(y_t \mid \mathcal{D}_0) = E(y_t).

That is, the statistical GM is

    y_t = E(y_t) + u_t, \quad t \in T,    (18.2)

with μ_t assumed to be related to a set of k non-stochastic (or controlled) variables x_t via

    \mu_t = b'x_t,    (18.3)

so that the statistical GM (2) takes the particular form

    y_t = b'x_t + u_t, \quad t \in T.    (18.5)
The underlying probability model is naturally defined in terms of the marginal distribution of y_t, say D(y_t; θ), where θ ≡ (b, σ²) are the statistical parameters of interest, being the parameters in terms of which the statistical GM (5) is defined. The probability model is defined by
    \Phi = \left\{ D(y_t; \theta) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left\{ -\frac{1}{2\sigma^2}(y_t - b'x_t)^2 \right\},\ \theta \in \mathbb{R}^k \times \mathbb{R}_+,\ t \in T \right\}.    (18.6)

In view of the assumption of independence of {y_t, t ∈ T} the sampling model, providing the link between the observed data and the statistical GM, is defined as follows: y ≡ (y₁, y₂, ..., y_T)' is an independent sample from D(y_t; θ), t = 1, 2, ..., T, respectively. It could not be a random sample in view of the fact that each y_t has a different mean.
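As an illustration, the following minimal Python sketch (with hypothetical values for b and σ) generates one such sample; it makes the point that the y_t's are independent but not identically distributed:

    import numpy as np

    rng = np.random.default_rng(0)

    T = 50
    x = np.linspace(1.0, 5.0, T)          # non-stochastic (controlled) values x_t
    b = np.array([2.0, 0.5])              # hypothetical 'true' parameters b1, b2
    sigma = 1.5                           # hypothetical 'true' standard deviation

    mu = b[0] + b[1] * x                  # systematic component mu_t = b'x_t
    y = rng.normal(loc=mu, scale=sigma)   # y_t ~ N(mu_t, sigma^2), independent

    # Each y_t has its own mean mu_t, so (y_1, ..., y_T) is an independent
    # but not a random (i.i.d.) sample.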
By construction the systematic and non-systematic components satisfy the following properties:

(i)   E(u_t) = E(y_t − E(y_t)) = 0;
(ii)  E(μ_t u_t) = μ_t E(u_t) = 0;
(iii) E(u_t u_s) = σ² if t = s, and 0 if t ≠ s.

Properties (i) and (iii) show that {u_t, t ∈ T} is a normal white-noise process and (ii) establishes the orthogonality of the two components. It is important to note that the distribution in terms of which the above expectation operator E(·) is defined is none other than D(y_t; θ₀), the distribution underlying the probability model, with θ₀ the 'true' value of θ.
The Gauss linear model is specified by the statistical GM (5), the probability model (6) and the sampling model defined above. Looking at this statistical model we can see that it purports to model an 'experimental-like' situation where the x_{it}'s are either fixed or controlled by the experimenter and the chosen values determine the systematic component of y_t via (3). This renders the statistical model of limited applicability in econometrics, because it distinguishes between y_t and the x_{it}'s on probabilistic grounds by assuming that y_t is a random variable and the x_{it}'s non-stochastic or controlled variables. In econometric modelling, however, apart from a time-trend variable, say x_t = t, t ∈ T, and dummy variables taking the value zero or one by design, it is very difficult to think of non-stochastic or controlled variables.
The Gauss linear model is of interest in econometrics mainly because it enhances our understanding of the linear regression model (see Chapter 19) when the two are compared. The two models seem almost identical notation-wise, thus causing some confusion; but a closer comparison reveals important differences rendering the two models applicable to very different situations. This will be pursued further in the next chapter.
18.2 Estimation
For expositional purposes let us consider the simplest case where there are only two non-stochastic variables (k = 2) and the statistical GM of the Gauss linear model takes the simple form

    y_t = b_1 + b_2 x_t + u_t, \quad t \in T.    (18.7)
The reason for choosing this simple case is to utilise the similarity of the mathematical manipulations between the Gauss linear and linear regression models in order to enhance the reader's understanding of the matrix notation used in the context of the latter (see Chapter 19). The first variable in (7) takes the value one for all t and is commonly called the constant (or intercept). In view of the independence of the sample, the likelihood function is the product of the marginal densities of the y_t's (see Section 13.3). The log likelihood takes the form:
    \log L(\theta; y) = -\frac{T}{2}\log 2\pi - \frac{T}{2}\log \sigma^2 - \frac{1}{2\sigma^2}\sum_t (y_t - b_1 - b_2 x_t)^2.    (18.10)
The first-order conditions for the derivation of the maximum likelihood estimators (MLE's) are:

    \frac{\partial \log L}{\partial b_1} = -\frac{1}{2\sigma^2}(-2)\sum_t (y_t - b_1 - b_2 x_t) = 0,    (18.11)

    \frac{\partial \log L}{\partial b_2} = -\frac{1}{2\sigma^2}(-2)\sum_t (y_t - b_1 - b_2 x_t)x_t = 0,    (18.12)

    \frac{\partial \log L}{\partial \sigma^2} = -\frac{T}{2\sigma^2} + \frac{1}{2\sigma^4}\sum_t (y_t - b_1 - b_2 x_t)^2 = 0.    (18.13)

Solving (11)-(13) simultaneously we get the MLE's

    \hat{b}_1 = \bar{y} - \hat{b}_2 \bar{x},    (18.14)

    \hat{b}_2 = \frac{\sum_t (y_t - \bar{y})(x_t - \bar{x})}{\sum_t (x_t - \bar{x})^2},    (18.15)

where

    \bar{y} = \frac{1}{T}\sum_{t=1}^T y_t, \quad \bar{x} = \frac{1}{T}\sum_{t=1}^T x_t,    (18.16)

and

    \hat{\sigma}^2 = \frac{1}{T}\sum_t \hat{u}_t^2,    (18.17)

the û_t = y_t − b̂₁ − b̂₂x_t, t = 1, 2, ..., T, being the residuals.
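A minimal Python sketch of the closed-form MLE's (14)-(17); the function name is illustrative, and x, y are assumed to be numpy arrays of equal length:

    import numpy as np

    def gauss_linear_mle(y, x):
        """MLE's of (b1, b2, sigma^2) for y_t = b1 + b2*x_t + u_t."""
        T = len(y)
        x_bar, y_bar = x.mean(), y.mean()
        b2_hat = np.sum((y - y_bar) * (x - x_bar)) / np.sum((x - x_bar) ** 2)  # (18.15)
        b1_hat = y_bar - b2_hat * x_bar                                        # (18.14)
        u_hat = y - b1_hat - b2_hat * x                                        # residuals
        sigma2_hat = np.sum(u_hat ** 2) / T                                    # (18.17), biased
        s2 = np.sum(u_hat ** 2) / (T - 2)                                      # unbiased s^2 (see below)
        return b1_hat, b2_hat, sigma2_hat, s2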
For θ = (b₁, b₂, σ²), the sample information matrix I_T(θ) and its inverse are

    I_T(\theta) = \frac{1}{\sigma^2}\begin{pmatrix} T & \sum_t x_t & 0 \\ \sum_t x_t & \sum_t x_t^2 & 0 \\ 0 & 0 & \dfrac{T}{2\sigma^2} \end{pmatrix}    (18.19)

and

    [I_T(\theta)]^{-1} = \begin{pmatrix} \dfrac{\sigma^2 \sum_t x_t^2}{T\sum_t (x_t-\bar{x})^2} & \dfrac{-\sigma^2 \bar{x}}{\sum_t (x_t-\bar{x})^2} & 0 \\ \dfrac{-\sigma^2 \bar{x}}{\sum_t (x_t-\bar{x})^2} & \dfrac{\sigma^2}{\sum_t (x_t-\bar{x})^2} & 0 \\ 0 & 0 & \dfrac{2\sigma^4}{T} \end{pmatrix}.    (18.20)

Note that I_T(θ) is positive definite (I_T(θ) > 0) if Σ_t (x_t − x̄)² ≠ 0, i.e. there must be at least two distinct values for x_t. This condition also ensures the existence of b̂₂ as defined by (15).
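The positive-definiteness condition can be checked numerically; the following sketch (names hypothetical) assembles I_T(θ) of (19) for a given σ²:

    import numpy as np

    def sample_information(x, sigma2):
        """Sample information matrix I_T(theta) of (18.19), theta = (b1, b2, sigma^2)."""
        T = len(x)
        return (1.0 / sigma2) * np.array([
            [T,        x.sum(),        0.0],
            [x.sum(),  (x ** 2).sum(), 0.0],
            [0.0,      0.0,            T / (2.0 * sigma2)],
        ])

    # I_T(theta) > 0 requires at least two distinct x_t values, i.e.
    # sum_t (x_t - x_bar)^2 != 0; equivalently all eigenvalues are positive:
    # np.all(np.linalg.eigvalsh(sample_information(x, 2.25)) > 0)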
Properties of θ̂ = (b̂₁, b̂₂, σ̂²)
(1) Asymptotic properties

The fact that θ̂ is a MLE enables us to conclude that if the asymptotic information matrix defined by I_∞(θ) = lim_{T→∞} [(1/T) I_T(θ)] is positive definite (see Chapters 12-13) then:

(i)   θ̂ → θ in probability, i.e. θ̂ is a consistent estimator of θ;
(ii)  √T(θ̂ − θ) ~ N(0, [I_∞(θ)]⁻¹) asymptotically, i.e. θ̂ is asymptotically normal;
(iii) E_∞(θ̂) = θ, i.e. θ̂ is asymptotically unbiased (the asymptotic mean of θ̂ is θ);
(iv)  θ̂ is asymptotically efficient.

I_∞(θ) is positive definite if det(I_∞(θ)) > 0; this is the case if

    \lim_{T\to\infty} \frac{1}{T}\sum_t (x_t - \bar{x})^2 \neq 0.    (18.21)
(2) Finite sample properties

θ̂ being a MLE we can deduce that:

(v) θ̂ is a function of the set of minimal sufficient statistics

    \tau(y) = \left( \sum_{t=1}^T y_t,\ \sum_{t=1}^T y_t^2,\ \sum_{t=1}^T y_t x_t \right);    (18.22)

(vi) θ̂ is invariant with respect to Borel functions, i.e. if h(θ): Θ → Θ, then the MLE of h(θ) is h(θ̂); see Section 13.3.
In order to consider any other small (finite) sample properties of θ̂ we need to derive its distribution. Because the mathematical manipulations are rather involved in the present case, no such manipulations are attempted here; they turn out to be much easier in matrix notation and will be carried out in the next chapter for the linear regression model, which when reinterpreted applies to the present case unaltered.

(vii) The distribution of (b̂₁, b̂₂) is

    \begin{pmatrix} \hat{b}_1 \\ \hat{b}_2 \end{pmatrix} \sim N\left( \begin{pmatrix} b_1 \\ b_2 \end{pmatrix}, \begin{pmatrix} Var(\hat{b}_1) & Cov(\hat{b}_1, \hat{b}_2) \\ Cov(\hat{b}_1, \hat{b}_2) & Var(\hat{b}_2) \end{pmatrix} \right),    (18.23)

where

    Var(\hat{b}_1) = \frac{\sigma^2 \sum_t x_t^2}{T \sum_t (x_t - \bar{x})^2}, \quad Cov(\hat{b}_1, \hat{b}_2) = \frac{-\sigma^2 \bar{x}}{\sum_t (x_t - \bar{x})^2}, \quad Var(\hat{b}_2) = \frac{\sigma^2}{\sum_t (x_t - \bar{x})^2}.

This result follows from the fact that b̂₁ = ȳ − b̂₂x̄ and b̂₂ = Σ_t λ_t y_t, where λ_t = (x_t − x̄)/Σ_s (x_s − x̄)², are linear functions of normally distributed random variables and thus themselves normally distributed (see Section 6.3). The distribution of σ̂² takes the form

    \frac{T\hat{\sigma}^2}{\sigma^2} \sim \chi^2(T-2),    (18.24)

where χ²(T−2) stands for the chi-square distribution with T−2 degrees of freedom. (24) follows from the fact that Tσ̂²/σ² = Σ_t (û_t/σ)² involves T−2 independent squared standard normally distributed random variables.

(viii) From (vii) it follows that E(b̂₁) = b₁ and E(b̂₂) = b₂, i.e. b̂₁ and b̂₂ are unbiased estimators of b₁ and b₂ respectively. On the other hand, since the mean of a chi-square random variable equals its degrees of freedom (see Appendix 6.1),

    E\left(\frac{T\hat{\sigma}^2}{\sigma^2}\right) = T-2, \quad \text{i.e.} \quad E(\hat{\sigma}^2) = \left(\frac{T-2}{T}\right)\sigma^2 \neq \sigma^2,

i.e. σ̂² is a biased estimator of σ², but the estimator s² = [1/(T−2)] Σ_t û_t² is unbiased and

    \frac{(T-2)s^2}{\sigma^2} \sim \chi^2(T-2).    (18.25)
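A small Monte Carlo sketch (hypothetical parameter values) illustrating (viii): the average of σ̂² across replications is close to ((T−2)/T)σ², while that of s² is close to σ²:

    import numpy as np

    rng = np.random.default_rng(1)
    T, b1, b2, sigma2 = 20, 2.0, 0.5, 4.0
    x = np.linspace(1.0, 5.0, T)

    sigma2_hats, s2s = [], []
    for _ in range(20000):
        y = b1 + b2 * x + rng.normal(scale=np.sqrt(sigma2), size=T)
        x_bar, y_bar = x.mean(), y.mean()
        b2_hat = np.sum((y - y_bar) * (x - x_bar)) / np.sum((x - x_bar) ** 2)
        u_hat = y - (y_bar - b2_hat * x_bar) - b2_hat * x
        sigma2_hats.append(np.sum(u_hat ** 2) / T)
        s2s.append(np.sum(u_hat ** 2) / (T - 2))

    print(np.mean(sigma2_hats))   # close to ((T-2)/T)*sigma2 = 3.6: biased
    print(np.mean(s2s))           # close to sigma2 = 4.0: unbiased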
(ix) (b̂₁, b̂₂) are independent of s² (or σ̂²). This can be verified by considering the covariance between them.

(x) Comparing (23) with (20) we can see that (b̂₁, b̂₂) achieve the Cramer-Rao lower bound and hence we can deduce that they are fully efficient. Given that σ̂² is biased, the Cramer-Rao bound given by (20) is not applicable to it, but for s² we know that

    Var\left(\frac{(T-2)s^2}{\sigma^2}\right) = 2(T-2) \quad \Rightarrow \quad Var(s^2) = \frac{2\sigma^4}{T-2} > \frac{2\sigma^4}{T} = \text{the Cramer-Rao bound}.

Thus, although s² does not achieve the Cramer-Rao lower bound, no other unbiased estimator of σ² achieves this bound.
18.3 Hypothesis testing and confidence intervals
In setting up tests and confidence intervals the distribution of θ̂ and any pivotal quantities thereof are of paramount importance. Consider the null hypothesis

    H_0: b_1 = \bar{b}_1 \quad \text{against} \quad H_1: b_1 \neq \bar{b}_1,

b̄₁ being a constant. Intuition suggests that the distance |b̂₁ − b̄₁|, scaled by its standard deviation, should provide the basis for a 'good' test statistic. Given that

    \frac{(\hat{b}_1 - \bar{b}_1)^2}{Var(\hat{b}_1)} = \frac{(\hat{b}_1 - \bar{b}_1)^2}{\sigma^2 \sum_t x_t^2 \Big/ T\sum_t (x_t - \bar{x})^2} \sim \chi^2(1),    (18.26)

this is not a pivotal quantity unless σ² is known. Otherwise we must find an alternative pivotal quantity. Taking (25) and (26) together and using the independence between b̂₁ and s² we can set up the pivotal quantity

    \tau(y) = \frac{(\hat{b}_1 - \bar{b}_1)\Big/\sqrt{Var(\hat{b}_1)}}{\sqrt{\dfrac{(T-2)s^2}{(T-2)\sigma^2}}}    (18.27)

           = \frac{\hat{b}_1 - \bar{b}_1}{\sqrt{\widehat{Var}(\hat{b}_1)}} \sim t(T-2),    (18.28)

where \widehat{Var}(\hat{b}_1) = s^2 \sum_t x_t^2 \big/ T\sum_t (x_t - \bar{x})^2. The rejection region for a size α test is

    C_1 = \left\{ y: \left| \frac{\hat{b}_1 - \bar{b}_1}{\sqrt{\widehat{Var}(\hat{b}_1)}} \right| \geq c_{\alpha/2} \right\}, \quad \text{where} \quad \int_{c_{\alpha/2}}^{\infty} \mathrm{d}t(T-2) = \frac{\alpha}{2}.    (18.29)
Using the duality between hypothesis testing and confidence intervals (see Section 14.5) we can construct a (1−α) level confidence interval for b₁ of the form

    C(y) = \left\{ b_1: \hat{b}_1 - c_{\alpha/2}\sqrt{\widehat{Var}(\hat{b}_1)} \leq b_1 \leq \hat{b}_1 + c_{\alpha/2}\sqrt{\widehat{Var}(\hat{b}_1)} \right\}.
Similarly, for H₀: b₂ = b̄₂ against H₁: b₂ ≠ b̄₂ the rejection region of a size α test is

    C_1 = \left\{ y: \left| \frac{\hat{b}_2 - \bar{b}_2}{\sqrt{\widehat{Var}(\hat{b}_2)}} \right| \geq c_{\alpha/2} \right\}, \quad \widehat{Var}(\hat{b}_2) = \frac{s^2}{\sum_t (x_t - \bar{x})^2}.    (18.32)

A (1−α) confidence interval is

    C(y) = \left\{ b_2: \hat{b}_2 - c_{\alpha/2}\,\frac{s}{\sqrt{\sum_t (x_t - \bar{x})^2}} \leq b_2 \leq \hat{b}_2 + c_{\alpha/2}\,\frac{s}{\sqrt{\sum_t (x_t - \bar{x})^2}} \right\}.    (18.33)
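The following is a minimal Python sketch of the two t-tests and the associated confidence intervals; the function name and the null values are hypothetical, and scipy is assumed available for the t(T−2) quantiles:

    import numpy as np
    from scipy import stats

    def t_tests(y, x, b1_null=0.0, b2_null=0.0, alpha=0.05):
        """Two-sided t-tests of H0: b1 = b1_null and H0: b2 = b2_null."""
        T = len(y)
        x_bar, y_bar = x.mean(), y.mean()
        Sxx = np.sum((x - x_bar) ** 2)
        b2_hat = np.sum((y - y_bar) * (x - x_bar)) / Sxx
        b1_hat = y_bar - b2_hat * x_bar
        s2 = np.sum((y - b1_hat - b2_hat * x) ** 2) / (T - 2)
        var_b1 = s2 * np.sum(x ** 2) / (T * Sxx)      # estimated Var(b1_hat)
        var_b2 = s2 / Sxx                             # estimated Var(b2_hat)
        c = stats.t.ppf(1.0 - alpha / 2.0, df=T - 2)  # c_{alpha/2}
        tau1 = (b1_hat - b1_null) / np.sqrt(var_b1)   # ~ t(T-2) under H0
        tau2 = (b2_hat - b2_null) / np.sqrt(var_b2)
        ci1 = (b1_hat - c * np.sqrt(var_b1), b1_hat + c * np.sqrt(var_b1))
        ci2 = (b2_hat - c * np.sqrt(var_b2), b2_hat + c * np.sqrt(var_b2))
        return (tau1, abs(tau1) >= c, ci1), (tau2, abs(tau2) >= c, ci2)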
Consider H₀: σ² = σ̄² against H₁: σ² ≠ σ̄². The pivotal quantity (25) can be used directly to set up the acceptance region

    C_0 = \left\{ y: a \leq \frac{(T-2)s^2}{\bar{\sigma}^2} \leq b \right\}, \quad \Pr(C_0) = 1 - \alpha,    (18.34)

such that

    \Pr(\chi^2(T-2) \leq a) = \Pr(\chi^2(T-2) \geq b) = \frac{\alpha}{2}.    (18.35)
A (1−α) level confidence interval is

    C(y) = \left\{ \sigma^2: \frac{(T-2)s^2}{b} \leq \sigma^2 \leq \frac{(T-2)s^2}{a} \right\}.    (18.36)
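A sketch of (34)-(36) using scipy's chi-square quantiles (illustrative only):

    from scipy import stats

    def sigma2_conf_interval(s2, T, alpha=0.05):
        """(1-alpha) confidence interval (18.36) for sigma^2 from s^2, T-2 d.f."""
        a = stats.chi2.ppf(alpha / 2.0, df=T - 2)        # Pr(chi2(T-2) <= a) = alpha/2
        b = stats.chi2.ppf(1.0 - alpha / 2.0, df=T - 2)  # Pr(chi2(T-2) >= b) = alpha/2
        return (T - 2) * s2 / b, (T - 2) * s2 / a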
Remark: One-sided tests can be easily constructed by modifying the above two-sided results; see Chapter 14.
The estimated systematic component μ̂_t = b̂₁ + b̂₂x_t has mean and variance

    E(\hat{\mu}_t) = \mu_t, \quad Var(\hat{\mu}_t) = \sigma^2\left[\frac{1}{T} + \frac{(x_t - \bar{x})^2}{\sum_i (x_i - \bar{x})^2}\right].    (18.37)

These results imply that the distribution of μ̂_t is normal and

    (i) \quad \frac{\hat{\mu}_t - \mu_t}{\sqrt{Var(\hat{\mu}_t)}} \sim N(0, 1);    (18.38)

    (ii) \quad \tau(y) = \frac{\hat{\mu}_t - \mu_t}{\sqrt{\widehat{Var}(\hat{\mu}_t)}} \sim t(T-2),    (18.39)

where \widehat{Var}(\hat{\mu}_t) replaces σ² in (37) with s². For c_α such that \int_{-c_\alpha}^{c_\alpha} \mathrm{d}t(T-2) = 1-\alpha, we could construct a (1−α) confidence interval of the form

    C(y) = \left\{ \mu_t: \hat{b}_1 + \hat{b}_2 x_t - c_\alpha s\sqrt{\frac{1}{T} + \frac{(x_t - \bar{x})^2}{\sum_i (x_i - \bar{x})^2}} \leq \mu_t \leq \hat{b}_1 + \hat{b}_2 x_t + c_\alpha s\sqrt{\frac{1}{T} + \frac{(x_t - \bar{x})^2}{\sum_i (x_i - \bar{x})^2}} \right\}.    (18.40)

This confidence interval can be extended to t > T in order to provide us with a prediction confidence interval for y_{T+l}, l ≥ 1:

    C(y) = \left\{ y_{T+l}: \hat{b}_1 + \hat{b}_2 x_{T+l} - c_\alpha s\sqrt{1 + \frac{1}{T} + \frac{(x_{T+l} - \bar{x})^2}{\sum_t (x_t - \bar{x})^2}} \leq y_{T+l} \leq \hat{b}_1 + \hat{b}_2 x_{T+l} + c_\alpha s\sqrt{1 + \frac{1}{T} + \frac{(x_{T+l} - \bar{x})^2}{\sum_t (x_t - \bar{x})^2}} \right\}    (18.41)

(see Chapters 12 and 14 on prediction).
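A sketch of the prediction interval (41) at a new design point x_{T+l} (names hypothetical):

    import numpy as np
    from scipy import stats

    def prediction_interval(y, x, x_new, alpha=0.05):
        """(1-alpha) prediction interval (18.41) for y at a new point x_new."""
        T = len(y)
        x_bar, y_bar = x.mean(), y.mean()
        Sxx = np.sum((x - x_bar) ** 2)
        b2_hat = np.sum((y - y_bar) * (x - x_bar)) / Sxx
        b1_hat = y_bar - b2_hat * x_bar
        s = np.sqrt(np.sum((y - b1_hat - b2_hat * x) ** 2) / (T - 2))
        c = stats.t.ppf(1.0 - alpha / 2.0, df=T - 2)
        half = c * s * np.sqrt(1.0 + 1.0 / T + (x_new - x_bar) ** 2 / Sxx)
        centre = b1_hat + b2_hat * x_new
        return centre - half, centre + half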
In concluding this section it is important to note that the hypothesis testing and confidence interval results derived above, as well as the estimation results of Section 18.2, are crucially dependent on the validity of the assumptions underlying the Gauss linear model. If any of these assumptions are in fact invalid, the above results are unwarranted to a greater or lesser degree (see Chapters 20-22 for misspecification analysis in the context of the linear regression model).

18.4 Experimental design
The MLE's b̂₁ and b̂₂ of b₁ and b₂ respectively are distributed as bivariate normal, as shown in (23). The fact that the x_t's are often controlled variables enables us to consider the question of 'designing' the statistical GM (5) so as to ensure that it satisfies certain desirable properties such as robustness and parsimony. These can be achieved by choosing the x_t's and their values appropriately.
Looking at their variances and covariances we can see that we could make b̂₁ and b̂₂ more 'accurate' by choosing the values of x_t in a certain way. Firstly, if x̄ = 0 then

    Cov(\hat{b}_1, \hat{b}_2) = 0    (18.42)

and b̂₁ and b̂₂ are now independent. This implies that if we were to make a change of origin in x_t we could ensure that b̂₁ and b̂₂ are independent. Secondly, the variances of b̂₁ and b̂₂ are minimised when Σ_t x_t² (given x̄ = 0) is as large as possible. This can be easily achieved by choosing the values of x_t to be on either side of zero (to achieve x̄ = 0) and as large as possible. For example, we could choose the x_t's so that

    x_1 = x_2 = \cdots = x_{T/2} = -n, \quad x_{(T/2)+1} = \cdots = x_T = n    (18.43)

(T even) and n is as large as possible; see Kendall and Stuart (1968). Another important feature of the Gauss linear model is that repeated observations on y can be generated for some specified values of the x_t's by repeating the experiment represented by the statistical GM (7).
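A numerical sketch (hypothetical values) comparing Var(b̂₂) = σ²/Σ_t(x_t − x̄)² under an evenly spaced design and the two-point design (43):

    import numpy as np

    T, sigma2, n = 20, 1.0, 5.0

    x_even = np.linspace(-n, n, T)                     # evenly spaced on [-n, n]
    x_design = np.concatenate([np.full(T // 2, -n),    # two-point design (18.43)
                               np.full(T // 2,  n)])

    for x in (x_even, x_design):
        print(sigma2 / np.sum((x - x.mean()) ** 2))    # Var(b2_hat)

    # The two-point design maximises sum_t (x_t - x_bar)^2 and hence minimises
    # Var(b2_hat), but with only two distinct x values departures from
    # linearity become undetectable.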
18.5 Looking ahead
From the econometric viewpoint the linear 'control knob' model can be seen to have two questionable features. Firstly, the fact that the x_{it}'s are assumed to be non-stochastic reduces the applicability of the model. Secondly, the independent sample assumption can be called into question for most economic data series. In other disciplines where experimentation is possible the Gauss linear model is a very important statistical model. The purpose of the next chapter is to develop a similar statistical model where the first questionable feature is substituted by a more realistic formulation of the systematic component. The variables involved are all assumed to be random variables at the outset.
Important concepts
Non-stochastic or controlled variables, residuals, experimental design.

Questions
1. Explain the statistical GM of the Gauss linear model.
2. Derive the MLE's of b and σ² in the case of the general Gauss linear model where y_t = b'x_t + u_t, t = 1, 2, ..., T, x_t being a k × 1 vector of non-stochastic variables, and state their asymptotic properties.
3. Explain under what circumstances the MLE's b̂₁ and b̂₂ of b₁ and b₂ respectively are independent. Can we design the values of the non-stochastic variables so as to get independence?
4. Explain why the statistic b̂₂/√[Vâr(b̂₂)] is distributed as t(T−2) under H₀: b₂ = 0 and use it to set up a test for H₀: b₂ = 0 against H₁: b₂ ≠ 0, as well as a confidence interval for b₂.
5. Verify that b̂ and σ̂² are independent.
Additional references