CHAPTER 17
Statistical models in econometrics
17.1 Simple statistical models
The main purpose of Parts II and III has been to formulate and discuss the concept of a statistical model, which will form the backbone of the discussion in Part IV. A statistical model has been defined as made up of two related components:
(i) a probability model, $\Phi = \{D(y; \theta), \theta \in \Theta\}$,† specifying a parametric family of densities indexed by $\theta$; and
(ii) a sampling model, $y = (y_1, y_2, \ldots, y_T)'$, defining a sample from $D(y; \theta_0)$, for some ‘true’ $\theta_0$ in $\Theta$.
The probability model provides the framework within which the stochastic environment of the real phenomenon being studied can be defined, and the sampling model describes the relationship between the probability model and the observable data. By postulating a statistical model we transform the uncertainty relating to the mechanism giving rise to the observed data into uncertainty relating to some unknown parameter(s) $\theta$ whose estimation determines the stochastic mechanism $D(y; \theta)$.
An example of such a statistical model in econometrics is provided by the modelling of the distribution of personal income. In studying the distribution of personal income higher than a lower limit $y_0$ the following statistical model is often postulated:
(i) $\Phi = \left\{ D(y/y_0; \theta) = \dfrac{\theta}{y_0}\left(\dfrac{y_0}{y}\right)^{\theta+1},\ \theta \in \mathbb{R}_+,\ y \geq y_0 \right\}$;
(ii) $y = (y_1, y_2, \ldots, y_T)'$ is a random sample from $D(y/y_0; \theta)$.
† The notation in Part IV will be somewhat different from the one used in Parts II and III. This change in notation has been made to conform with the established econometric notation.
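Note:
$$E(y) = y_0\left(\frac{\theta}{\theta - 1}\right) \quad \text{if } \theta > 1,$$
$$\mathrm{Var}(y) = y_0^2\left(\frac{\theta}{(\theta - 1)^2(\theta - 2)}\right) \quad \text{if } \theta > 2.$$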
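For $y$ a random sample the likelihood function is
$$L(\theta; y) = \prod_{t=1}^{T} \frac{\theta}{y_0}\left(\frac{y_0}{y_t}\right)^{\theta+1} = \theta^{T} y_0^{T\theta}\,(y_1 y_2 \cdots y_T)^{-(\theta+1)},$$
$$\log L(\theta; y) = T \log\theta + T\theta \log y_0 - (\theta + 1)\sum_{t=1}^{T} \log y_t,$$
$$\frac{d \log L}{d\theta} = \frac{T}{\theta} + T \log y_0 - \sum_{t=1}^{T} \log y_t = 0,$$
so that
$$\hat\theta = T\left[\sum_{t=1}^{T} \log\left(\frac{y_t}{y_0}\right)\right]^{-1}$$
is the maximum likelihood estimator (MLE) of the parameter $\theta$. Since $(d^2 \log L)/d\theta^2 = -T/\theta^2$, the asymptotic distribution of $\hat\theta$ takes the form (see Chapter 13):
$$\sqrt{T}(\hat\theta - \theta) \sim N(0, \theta^2).$$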
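The estimator just derived can be tried out on simulated data. The following is only a minimal sketch: the function names, the seed, and the values $\theta = 1.6$ and $y_0 = 5000$ are our own illustrative choices, and the draws are generated by inverting the Pareto distribution function $F(y) = 1 - (y_0/y)^{\theta}$.

```python
import math
import random


def pareto_sample(theta, y0, T, rng):
    """Draw T observations from the density (theta/y0) * (y0/y)**(theta + 1), y >= y0."""
    # Inverse-CDF method: F(y) = 1 - (y0/y)**theta, so y = y0 * (1 - U)**(-1/theta).
    return [y0 * (1.0 - rng.random()) ** (-1.0 / theta) for _ in range(T)]


def pareto_mle(y, y0):
    """MLE of theta for a known lower limit y0: T / sum(log(y_t / y0))."""
    return len(y) / sum(math.log(v / y0) for v in y)


rng = random.Random(0)        # fixed seed, chosen arbitrarily
theta_true, y0 = 1.6, 5000.0  # illustrative values only
for T in (8, 100, 10000):
    y = pareto_sample(theta_true, y0, T, rng)
    print(T, round(pareto_mle(y, y0), 3))
```

For small $T$ the estimate can lie far from the value used to generate the data, while for large $T$ it should settle close to it, in line with the asymptotic result.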
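Although the finite sample distribution of an estimator is not usually available, in this particular case we can derive $D(\hat\theta)$ analytically. It takes the form
$$D(\hat\theta) = \frac{\theta^{T-1} T^{T-1}}{\Gamma(T-1)}\,\hat\theta^{-T} \exp\left(-\frac{T\theta}{\hat\theta}\right), \quad \hat\theta > 0, \qquad \text{i.e.} \quad \left(\frac{2T\theta}{\hat\theta}\right) \sim \chi^2(2T - 2)$$
(see Appendix 6.1). This distribution of $\hat\theta$ can be used to consider the finite sample properties of $\hat\theta$ as well as to test hypotheses or set up confidence intervals for the unknown parameter $\theta$. For instance, in view of the fact that
$$E(\hat\theta) = \left(\frac{T}{T-2}\right)\theta$$
we can deduce that $\hat\theta$ is a biased estimator of $\theta$.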
It is of interest in this particular case to assess the ‘accuracy’ of the asymptotic distribution of $\hat\theta$ for a small $T$ ($T = 8$), by noting that
$$\mathrm{Var}(\hat\theta) = \frac{T^2\theta^2}{(T-2)^2(T-3)}$$
(see Johnson and Kotz (1970)). Using the data on income distribution (see Chapter 2) for $y > 5000$ (reproduced below) to estimate $\theta$,

Income lower limit   5000   6000   7000   8000   10 000   12 000   15 000   20 000
No. of incomes       2600   1890   1150    990      410      220      100       50

we get
$$\hat\theta = T\left[\sum_{t=1}^{T} \log\left(\frac{y_t}{y_0}\right)\right]^{-1} = 1.6$$
as the ML estimate. Using the invariance property of MLE's (see Section 13.3) we can deduce that $E(\hat\theta) = 2.13$, $\mathrm{Var}(\hat\theta) = 0.91$.
As we can see, for a small sample ($T = 8$) the estimates of the mean and the variance are considerably larger than the ones given by the asymptotic distribution:
$$E(\hat\theta) = 1.6, \quad \mathrm{Var}(\hat\theta) = \frac{\hat\theta^2}{T} = 0.32.$$
On the other hand, for a much larger sample, say $T = 100$, $E(\hat\theta) = 1.63$, $\mathrm{Var}(\hat\theta) = 0.028$, as compared with $E(\hat\theta) = 1.6$, $\mathrm{Var}(\hat\theta) = 0.026$.
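The comparison between finite sample and asymptotic moments can be reproduced with a few lines of arithmetic. The sketch below assumes that the finite sample mean and variance of the estimator are $T\theta/(T-2)$ and $T^2\theta^2/((T-2)^2(T-3))$, which is the reading consistent with the figures quoted above, and evaluates them at the estimate 1.6; the function names are ours.

```python
def finite_sample_moments(theta, T):
    """Mean and variance from the finite-sample formulas assumed above (needs T > 3)."""
    mean = T * theta / (T - 2)
    var = (T * theta) ** 2 / ((T - 2) ** 2 * (T - 3))
    return mean, var


def asymptotic_moments(theta, T):
    """Mean and variance implied by sqrt(T) * (theta_hat - theta) ~ N(0, theta**2)."""
    return theta, theta ** 2 / T


theta_hat = 1.6  # the ML estimate reported in the text
for T in (8, 100):
    fm, fv = finite_sample_moments(theta_hat, T)
    am, av = asymptotic_moments(theta_hat, T)
    print(f"T={T}: finite-sample mean={fm:.3f}, var={fv:.4f}; "
          f"asymptotic mean={am:.3f}, var={av:.4f}")
```

The output should come out close to the figures in the text; differences in the last digit may arise because the estimate is used here rounded to 1.6.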
These results exemplify the danger of using asymptotic results for small samples and should be viewed as a warning against uncritical use of asymptotic theory. For a more general discussion of asymptotic theory and how to improve upon the asymptotic results see Chapter 10.
The statistical inference results derived above in relation to the income distribution example depend crucially on the appropriateness of the statistical model postulated. That is, the statistical model should represent a good approximation of the real phenomenon to be explained in a way which takes account of the nature of the available data. For example, if the assumptions underlying the statistical model are invalid the above estimation results are unwarranted.
In the next three sections it is argued that for the purposes of econometric modelling we need to extend the simple statistical model based on a random sample, illustrated above, in certain specific directions as required by the particular features of econometric modelling. In Section 17.2 we consider the nature of economic data commonly available and discuss its implications for the form of the sampling model. It is argued that for most forms of economic data the random sample assumption is inappropriate. Section 17.3 considers the question of constructing probability models if the identically distributed assumption does not hold. The concept of a statistical generating mechanism (GM) is introduced in Section 17.4 in order to supplement the probability and sampling models. This additional component enables us to accommodate certain specific features of econometric modelling. In Section 17.5 the main statistical models of interest in econometrics are summarised as a prelude to the discussion which follows.
17.2 Economic data and the sampling model
Economic data are usually non-experimental in nature and come in one of
three forms:
(i) time series, measuring a particular variable at successive points in time (annual, quarterly, monthly or weekly);
(ii) cross-section, measuring a particular variable at a given point in time over different units (persons, households, firms, industries, countries, etc.);
(iii) panel data, which refer to cross-section data over time.
Economic data such as M1 money stock (M), real consumers’ expenditure (Y) and its implicit deflator (P), and the interest rate on 7 days’ deposit account (I), observed over time, are examples of time-series data (see Appendix, Table 17.2). The income data used in Chapter 2 are cross-section data on 23 000 households in the UK for 1979-80. Using the same 23 000 households of the cross-section observed over time we could generate panel data on income. In practice, panel data are rather rare in econometrics because of the difficulties involved in gathering such data. For a thorough discussion of econometric modelling using panel data see Chamberlain (1984).
The econometric modeller is rarely involved directly with the data collection and refinement and often has to use published data knowing very little about their origins. This lack of knowledge can have serious repercussions on the modelling process and lead to misleading conclusions.
choice of an appropriate sampling model. Moreover, if the choice of the data is based only on the name they carry and not on intimate knowledge of what exactly they are measuring, it can lead to an inappropriate choice of the statistical GM (see Section 17.4, below) and some misleading conclusions about the relationship between the estimated econometric model and the theoretical model as suggested by economic theory (see Chapter 1). Let us consider the relationship between the nature of the data and the sampling model in some more detail.
In Chapter 11 we discussed three basic forms of a sampling model:
(i) random sample: a set of independent and identically distributed (IID) random variables (r.v.'s);
(ii) independent sample: a set of independent but not identically distributed r.v.'s; and
(iii) non-random sample: a set of non-IID r.v.'s.
For cross-section data selected by the simple random sampling method (where every unit in the target population has the same probability of being selected), the sampling model of a random sample seems the most appropriate choice. On the other hand, for cross-section data selected by the stratified sampling method (the target population is divided into a number of groups (strata), with every unit in each group having the same probability of being selected), the identically distributed assumption seems rather inappropriate, because the groups are chosen a priori in some systematic way. For such cross-section data the sampling model of an independent sample seems more appropriate. The independence assumption can be justified if sampling within and between groups is random.
For time-series data the sampling models of a random or an independent sample seem rather unrealistic on a priori grounds, leaving the non-random sample as the most likely sampling model to postulate at the outset. For the time-series data plotted against time in Fig. 17.1(a)-(d) the assumption that they represent realisations of stochastic processes (see Chapter 8) seems more realistic than their being realisations of IID r.v.'s. The plotted series exhibit considerable time dependence. This is confirmed in Chapter 23, where these series are used to estimate a money adjustment equation. In Chapters 19-22 the sampling model of an independent sample is intentionally maintained for the example which involves these data series and several misleading conclusions are noted throughout.
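The contrast between an IID sample and a time-dependent one can be illustrated with simulated series (the data plotted in Fig. 17.1 are not reproduced here). The sketch below is ours: it compares the sample first-order autocorrelation of an IID normal series with that of a first-order autoregressive series, a hypothetical process chosen purely for illustration.

```python
import random


def lag1_autocorr(series):
    """Sample first-order autocorrelation coefficient."""
    n = len(series)
    mean = sum(series) / n
    num = sum((series[t] - mean) * (series[t - 1] - mean) for t in range(1, n))
    den = sum((v - mean) ** 2 for v in series)
    return num / den


def simulate_ar1(rho, n, rng):
    """y_t = rho * y_{t-1} + e_t with standard normal e_t (illustrative process)."""
    y, prev = [], 0.0
    for _ in range(n):
        prev = rho * prev + rng.gauss(0.0, 1.0)
        y.append(prev)
    return y


rng = random.Random(0)  # fixed seed, chosen arbitrarily
iid = [rng.gauss(0.0, 1.0) for _ in range(500)]
dependent = simulate_ar1(0.9, 500, rng)
print("IID series:      ", round(lag1_autocorr(iid), 2))
print("dependent series:", round(lag1_autocorr(dependent), 2))
```

A coefficient near zero is what a random sample would produce; a coefficient near one signals the kind of time dependence for which the non-random sample is the natural sampling model.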
In order to be able to take explicitly into consideration the nature of the
Fig. 17.1 (a) Money stock, £ (million), against time, 1963-1982. (b) Real consumers’ expenditure against time, 1963-1982.
approach in econometrics textbooks (see Theil (1971), Maddala (1977), Judge et al. (1982), inter alia). The approach adopted in the present book is to extend the statistical models considered so far in Part III in order to accommodate certain specific features of econometric modelling. In particular a third component, called a statistical generating mechanism
Fig. 17.1 (c) Implicit price deflator against time, 1963-1982. (d) Interest rate on 7 days’ deposit account against time, 1963-1982.
‘an adequate’ approximation to the actual DGP giving rise to the observed data (see Chapter 1). This additional component will be considered extensively in Section 17.4 below. In the next section the nature of the probability models required in econometric modelling will be discussed in
17.3 Economic data and the probability model
In Chapter 1 it was argued that the specification of statistical models should take account not only of the theoretical a priori information available but of the nature of the observed data chosen as well. This is because the specification of statistical models proposed in the present book is based on the observable random variables giving rise to the observed data and not on attaching a white-noise error term to the theoretical model. This strategy implies that the modeller should consider assumptions such as independence, stationarity and mixing (see Chapter 8) in relation to the observed data at the outset.
As argued in Section 17.2, the sampling model of a random sample seems rather unrealistic for most situations in econometric modelling in view of the economic data usually available. Because of the interrelationship between the sampling and the probability model we need to extend the simple probability model $\Phi = \{D(y; \theta), \theta \in \Theta\}$ associated with a random sample to ones related to independent and non-random samples.
An independent (but non-identically distributed) sample $y = (y_1, \ldots, y_T)'$ raises questions of time-heterogeneity in the context of the corresponding probability model. This is because in general every element $y_t$ of $y$ has its own distribution with different parameters, $D(y_t; \theta_t)$. The parameters $\theta_t$, which depend on $t$, are called incidental parameters. A probability model related to $y$ takes the general form
$$\Phi = \{D(y_t; \theta_t),\ \theta_t \in \Theta,\ t \in \mathbb{T}\}, \tag{17.1}$$
where $\mathbb{T} = \{1, 2, \ldots\}$ is an index set.
A non-random sample $y$ raises questions not only of time-heterogeneity but of time-dependence as well. In this case we need the joint distribution of $y$ in order to define an appropriate probability model of the general form
$$\Phi = \{D(y_1, y_2, \ldots, y_T; \theta_T),\ \theta_T \in \Theta,\ \mathbb{T}_1 = (1, 2, \ldots, T) \subseteq \mathbb{T}\}. \tag{17.2}$$
In both of the above cases the observed data can be viewed as realisations of the stochastic process $\{y_t, t \in \mathbb{T}\}$ and for modelling purposes we need to restrict its generality using assumptions such as normality, stationarity and asymptotic independence and/or supplement the sample and theoretical information available. In order to illustrate these let us consider the simplest case of an independent sample and one incidental parameter:
(i) $\Phi = \left\{ D(y_t; \theta_t) = \dfrac{1}{\sigma\sqrt{2\pi}} \exp\left[-\dfrac{1}{2}\left(\dfrac{y_t - \mu_t}{\sigma}\right)^2\right],\ \theta_t = (\mu_t, \sigma^2) \in \mathbb{R} \times \mathbb{R}_+,\ t \in \mathbb{T} \right\}$;
(ii) $y = (y_1, y_2, \ldots, y_T)'$ is an independent sample from $D(y_t; \theta_t)$, $t = 1, 2, \ldots, T$, respectively.
The probability model postulates a normal density with mean $\mu_t$ (an incidental parameter) and variance $\sigma^2$. The sampling model allows each $y_t$ to have a different mean but the same variance and to be independent of the other $y_t$s. The distribution of the sample for the above statistical model, $D(y; \theta)$, where $y = (y_1, y_2, \ldots, y_T)'$ and $\theta = (\mu_1, \mu_2, \ldots, \mu_T, \sigma^2)$, is
$$D(y; \theta) = \prod_{t=1}^{T} D(y_t; \mu_t, \sigma^2) = (\sigma^2)^{-T/2}(2\pi)^{-T/2} \exp\left\{-\frac{1}{2\sigma^2} \sum_{t=1}^{T} (y_t - \mu_t)^2\right\}. \tag{17.3}$$
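As we can see, there are $T + 1$ unknown parameters, $\theta = (\sigma^2, \mu_1, \mu_2, \ldots, \mu_T)$, to be estimated and only $T$ observations, which provides us with sufficient warning that there will be problems. This is indeed confirmed by the maximum likelihood (ML) method. The log likelihood is
$$\log L(\theta; y) = \text{const} - \frac{T}{2} \log \sigma^2 - \frac{1}{2\sigma^2} \sum_{t=1}^{T} (y_t - \mu_t)^2, \tag{17.4}$$
$$\frac{\partial \log L}{\partial \mu_t} = -\frac{1}{2\sigma^2}(-2)(y_t - \mu_t) = 0, \quad t = 1, 2, \ldots, T, \tag{17.5}$$
$$\frac{\partial \log L}{\partial \sigma^2} = -\frac{T}{2\sigma^2} + \frac{1}{2\sigma^4} \sum_{t=1}^{T} (y_t - \mu_t)^2 = 0. \tag{17.6}$$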
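These first-order conditions imply that $\hat\mu_t = y_t$, $t = 1, 2, \ldots, T$, and $\hat\sigma^2 = 0$. Before we rush into pronouncing these as MLE's it is important to look at the second-order conditions for a maximum,
$$\left.\frac{\partial^2 \log L}{\partial \mu_t^2}\right|_{\mu_t = \hat\mu_t,\ \sigma^2 = \hat\sigma^2} = -\frac{1}{\hat\sigma^2}, \qquad \left.\frac{\partial^2 \log L}{\partial (\sigma^2)^2}\right|_{\mu_t = \hat\mu_t,\ \sigma^2 = \hat\sigma^2} = \frac{T}{2\hat\sigma^4} - \frac{1}{\hat\sigma^6} \sum_{t=1}^{T} (y_t - \hat\mu_t)^2,$$
which are unbounded, and hence $\hat\mu_t$ and $\hat\sigma^2$ are not MLE's; see Section 13.3.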
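The breakdown of the ML method can also be seen numerically. The sketch below is ours, with arbitrary illustrative observations: it sets each mean equal to the corresponding observation and evaluates the normal log likelihood for a shrinking variance.

```python
import math


def log_likelihood(y, mu, sigma2):
    """Log likelihood of independent N(mu_t, sigma2) observations (constant included)."""
    T = len(y)
    ss = sum((yt - mt) ** 2 for yt, mt in zip(y, mu))
    return -0.5 * T * math.log(2 * math.pi) - 0.5 * T * math.log(sigma2) - ss / (2 * sigma2)


y = [1.2, -0.7, 3.1, 0.4]  # arbitrary illustrative observations
mu_hat = list(y)           # the first-order conditions give mu_t = y_t
for sigma2 in (1.0, 1e-2, 1e-4, 1e-8):
    print(sigma2, round(log_likelihood(y, mu_hat, sigma2), 2))
```

With the sum of squares identically zero, the log likelihood reduces to a constant minus $(T/2)\log\sigma^2$ and grows without bound as $\sigma^2$ approaches zero, so no maximum exists.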
This suggests that there is not enough information in the statistical model (i)-(ii) above to estimate the statistical parameters $\theta = (\mu_1, \mu_2, \ldots, \mu_T, \sigma^2)$. An obvious way to supplement this information is in the form of panel data for $y_t$, say $y_{it}$, $i = 1, 2, \ldots, N$, $t = 1, 2, \ldots, T$. In the case where $N$ realisations of $y_t$ are available at each $t$, $\theta$ could be estimated by
$$\hat\mu_t = \frac{1}{N} \sum_{i=1}^{N} y_{it}, \quad t = 1, 2, \ldots, T, \tag{17.7}$$
and
$$\hat\sigma^2 = \frac{1}{T} \sum_{t=1}^{T} \frac{1}{N} \sum_{i=1}^{N} (y_{it} - \hat\mu_t)^2. \tag{17.8}$$
It can be verified that these are indeed the MLE's of $\theta$.
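A small simulation shows how panel data resolve the incidental parameter problem. The sketch below assumes the estimators are the sample mean of the $N$ realisations at each $t$ and the average squared deviation from those means pooled over all $NT$ observations; the function name, the means and the standard deviation are hypothetical values of our own.

```python
import random


def panel_estimates(y):
    """y[t][i] holds realisation i of y_t; returns (list of mu_hat_t, sigma2_hat)."""
    T, N = len(y), len(y[0])
    mu_hat = [sum(row) / N for row in y]
    sigma2_hat = sum((v - m) ** 2 for row, m in zip(y, mu_hat) for v in row) / (N * T)
    return mu_hat, sigma2_hat


rng = random.Random(0)       # fixed seed, chosen arbitrarily
true_mu = [0.0, 2.0, -1.0]   # hypothetical incidental means, T = 3
true_sigma = 1.5             # hypothetical common standard deviation
data = [[rng.gauss(m, true_sigma) for _ in range(2000)] for m in true_mu]
mu_hat, sigma2_hat = panel_estimates(data)
print([round(m, 2) for m in mu_hat], round(sigma2_hat, 2))
```

With $N = 2000$ realisations at each $t$ the estimates should lie close to the values used to generate the data; with $N = 1$ the variance estimate collapses to zero, which is the degenerate case discussed above.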
An alternative way to supplement the information of the statistical model (i)-(ii) is to reduce the dimensionality of the statistical parameter space $\Theta$. This can be achieved by imposing restrictions on $\theta$ or by modelling $\theta$, relating it to other observable random variables (r.v.'s) via conditioning (see Chapter 7). Note that non-stochastic variables are viewed as degenerate r.v.'s. The latter procedure enables us to accommodate theoretical information within a probability model by relating such information to the statistical parameters $\theta$. In particular, such information is related to the mean (marginal or conditional) of the r.v.'s involved and sometimes to the variance. Theoretical information is rarely related to higher-order moments (see Chapter 4).
The modelling of statistical parameters via conditioning leads naturally to an additional component to supplement the probability and sampling models. This additional component we call a statistical generating mechanism (GM) for reasons which will become apparent in the discussion which follows. At this stage it suffices to say that the statistical GM is postulated as a crude approximation to the actual DGP which gave rise to the observed data in question, taking account of the nature of such data as well as theoretical a priori information.
In the case of the statistical model (i)-(ii) above we could ‘solve’ the inadequate information problem by relating $\mu_t$ to a vector of observable variables $x_{1t}, x_{2t}, \ldots, x_{Kt}$, $t = 1, 2, \ldots, T$, say, linearly, to postulate
$$\mu_t = b'x_t, \tag{17.9}$$
where $b = (b_1, b_2, \ldots, b_K)'$, $K < T$, is a vector of unknown parameters. By postulating this relationship we reduce the parameter space from $\Theta = \mathbb{R}^T \times \mathbb{R}_+$, which increases with $T$, to $\Theta_1 = \mathbb{R}^K \times \mathbb{R}_+$, which is independent of $T$.
The statistical GM in this case takes the general form
$$y_t = b'x_t + u_t, \quad t \in \mathbb{T}, \tag{17.10}$$
where $\mu_t = b'x_t$ and $u_t = y_t - b'x_t$ are called the systematic and non-systematic components of $y_t$, respectively. By construction
$$E(\mu_t u_t) = 0 \quad \text{and} \quad E(u_t) = 0, \quad E(u_t u_s) = \begin{cases} \sigma^2, & t = s, \\ 0, & t \neq s, \end{cases}$$
where $E(\cdot)$ is defined relative to $D(y_t; \theta)$, the marginal distribution of $y_t$.
..., $x_{Kt}$ determines the systematic part of $y_t$, with the unmodelled part $u_t$ being a white-noise process (see Chapter 8). This is the statistical GM of the Gauss linear model (see Chapter 18). The above statistical GM will be extended in the next section in order to define some of the most widely used statistical models in econometrics.
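The properties attributed to the non-systematic component can be checked on simulated data. The sketch below is ours: it generates data from a GM of the form $y_t = b'x_t + u_t$ with $K = 2$, hypothetical coefficients and normal errors, and computes the sample analogues of $E(u_t)$, $E(\mu_t u_t)$ and $E(u_t^2)$ using the coefficients that generated the data (no estimation is attempted here).

```python
import random


def simulate_gm(b, x_rows, sigma, rng):
    """Generate y_t = b'x_t + u_t with u_t ~ N(0, sigma**2); returns (mu, u, y) lists."""
    mu = [sum(bk * xk for bk, xk in zip(b, x)) for x in x_rows]
    u = [rng.gauss(0.0, sigma) for _ in x_rows]
    y = [m + e for m, e in zip(mu, u)]
    return mu, u, y


rng = random.Random(0)                                         # fixed seed, chosen arbitrarily
b = [1.0, 0.5]                                                 # hypothetical coefficients, K = 2
x_rows = [[1.0, rng.uniform(0.0, 10.0)] for _ in range(5000)]  # constant plus one regressor
mu, u, y = simulate_gm(b, x_rows, 2.0, rng)

n = len(y)
print("mean of u_t:       ", round(sum(u) / n, 3))
print("mean of mu_t * u_t:", round(sum(m * e for m, e in zip(mu, u)) / n, 3))
print("mean of u_t ** 2:  ", round(sum(e * e for e in u) / n, 3))
```

The first two averages should be close to zero and the third close to the error variance used in the simulation (4 here), whatever the number of observations, since only $K + 1$ parameters are involved.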
17.4 The statistical generating mechanism
The concept of a statistical GM is postulated to supplement the probability and sampling models and represents a crude approximation to the actual DGP which gave rise to the available data. It represents a summarisation of the sample information in a way which enables us to accommodate any a priori information related to the actual DGP as suggested by economic theory (see Chapter 1).
Let $\{y_t, t \in \mathbb{T}\}$ be a stochastic process defined on $(S, \mathcal{F}, P(\cdot))$ (see Chapter 8). The statistical GM is defined by
$$y_t = \mu_t + u_t, \quad t \in \mathbb{T}, \tag{17.11}$$
where
$$\mu_t = E(y_t/\mathcal{D}_t), \quad \mathcal{D}_t \subseteq \mathcal{F}, \tag{17.12}$$
$\mathcal{D}_t$ being some $\sigma$-field. This defines the statistical process generating $y_t$, with $\mu_t$ being the postulated systematic mechanism giving rise to the observed data on $y_t$ and $u_t$ the non-systematic part of $y_t$ defined by $u_t = y_t - \mu_t$. Defining $u_t$ this way ensures that it is orthogonal to the systematic component $\mu_t$, denoted by $\mu_t \perp u_t$ (see Chapter 7). The orthogonality condition is needed for the logical consistency of the statistical GM in view of the fact that $u_t$ represents the part of $y_t$ left unexplained by the choice of $\mu_t$. The terms systematic, non-systematic and orthogonality are formalised in terms of the underlying probability and sampling models defining the statistical model.
It must be emphasised at the outset that the terms systematic and non-systematic are relative to the information set as defined by the underlying probability and sampling models as well as to any a priori information related to the statistical parameters of interest $\theta$. This information is incorporated in the definition of the systematic component, and the remaining part of $y_t$ we call non-systematic or error. Hence, the nature of $u_t$ depends crucially on how $\mu_t$ is defined and incorporates the unmodelled part of $y_t$. This definition of the error term differs significantly from the usual use of the term in econometrics as either errors-in-equation or errors of measurement. The use of the concept in the present book comes much
Kalman (1982)). Our aim in postulating a statistical GM is to minimise the non-systematic component $u_t$ by making the most of the systematic information in defining the systematic component $\mu_t$. For more discussion on the error term see Hendry (1983).
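Let $\{Z_t, t \in \mathbb{T}\}$ be a $k \times 1$ vector stochastic process defined on $(S, \mathcal{F}, P(\cdot))$ which represents the observable random variables involved. Let $y_t$ be the random variable whose behaviour is of interest, where
$$Z_t = \begin{pmatrix} y_t \\ X_t \end{pmatrix}.$$
For a conditioning information set $\mathcal{D}_t$ the systematic component of $y_t$ can be defined by
$$\mu_t = E(y_t/\mathcal{D}_t), \quad t \in \mathbb{T}, \tag{17.13}$$
where $\mathcal{D}_t$ is some sub-$\sigma$-field of $\mathcal{F}$. The non-systematic component $u_t$ represents the unmodelled part of $y_t$ given $\mathcal{D}_t$, i.e.
$$u_t = y_t - E(y_t/\mathcal{D}_t), \quad t \in \mathbb{T}. \tag{17.14}$$
These two components give rise to the general statistical GM
$$y_t = E(y_t/\mathcal{D}_t) + u_t, \quad t \in \mathbb{T}, \tag{17.15}$$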
where by construction,
(i) $E(u_t/\mathcal{D}_t) = E\{[y_t - E(y_t/\mathcal{D}_t)]/\mathcal{D}_t\} = 0$; (17.16)
(ii) $E(\mu_t u_t/\mathcal{D}_t) = \mu_t E(u_t/\mathcal{D}_t) = 0$; (17.17)
using the properties of conditional expectation (see Chapter 7). It is important to note at this stage that the expectation operator $E(\cdot)$ in (16) and (17) is defined relative to the probability distribution of the underlying probability model. By changing $\mathcal{D}_t$ (and the related probability model) we can define some of the most important statistical models of interest in econometrics. Let us consider some of these special cases.
(a) Assuming that $\{Z_t, t \in \mathbb{T}\}$ is a normal IID stochastic process and choosing $\mathcal{D}_t = \{X_t = x_t\}$, a degenerate $\sigma$-field, (15) takes the special form
$$y_t = \beta'x_t + u_t, \quad t \in \mathbb{T}, \tag{17.18}$$
where the underlying probability model is based on $D(y_t/X_t; \theta)$. This defines the linear regression model (see Chapter 19).
(b) Assuming that $\{Z_t, t \in \mathbb{T}\}$ is a normal IID stochastic process and choosing $\mathcal{D}_t = \sigma(X_t)$, (15) becomes
$$y_t = \beta'X_t + u_t, \quad t \in \mathbb{T}, \tag{17.19}$$
with $D(Z_t; \psi)$ being the distribution defining the probability model. (19) represents the statistical GM of the stochastic linear regression model (see Chapter 20).
(c) Assuming that $\{Z_t, t \in \mathbb{T}\}$ is a normal stationary $l$th-order Markov process and choosing the appropriate $\sigma$-field to be $\mathcal{D}_t = \sigma(y^0_{t-1}, X^0_t = x^0_t)$, $y^0_{t-1} = (y_{t-i}, i = 1, 2, \ldots)$, $X^0_t = (X_{t-i}, i = 0, 1, 2, \ldots)$, (15) takes the form
$$y_t = \beta_0'x_t + \sum_{i=1}^{l} (\alpha_i y_{t-i} + \beta_i'x_{t-i}) + u_t, \tag{17.20}$$
where the underlying probability model is based on $D(y_t/y^0_{t-1}, X^0_t; \theta_t)$. This defines the statistical GM of the dynamic linear regression model (see Chapter 23).
(d) Assuming that $\{Z_t, t \in \mathbb{T}\}$ is a normal IID stochastic process and $y_t$ is an $m \times 1$ subvector of $Z_t$, the $\sigma$-field $\mathcal{D}_t = \sigma(X_t = x_t)$ reduces (15) to
$$y_t = B'x_t + u_t, \quad t \in \mathbb{T}, \tag{17.21}$$
with $D(y_t/X_t; \theta^*)$ the distribution defining the underlying probability model. This is the statistical GM of the multivariate linear regression model (see Chapter 24).
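An important feature of any statistical GM is the set of parameters defining it. These parameters are called the statistical parameters of interest. For instance, in the case of (18) and (19) the statistical parameters of interest are $\theta = (\beta, \sigma^2)$, $\beta = \Sigma_{22}^{-1}\sigma_{21}$, $\sigma^2 = \sigma_{11} - \sigma_{12}\Sigma_{22}^{-1}\sigma_{21}$. These are functions of the parameters of $D(Z_t; \psi)$, assumed to be
$$\begin{pmatrix} y_t \\ X_t \end{pmatrix} \sim N\left(\begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{21} & \Sigma_{22} \end{pmatrix}\right) \tag{17.22}$$
(see Chapter 15).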
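The way the statistical parameters of interest arise as functions of the parameters of the joint distribution can be sketched for the simplest case of a single regressor, where the covariance blocks are scalars. Everything in the sketch is our own illustration: the covariance entries are hypothetical, and the joint normal draws with zero means are built from two independent standard normal variables.

```python
import math
import random


def regression_parameters(s11, s12, s22):
    """Scalar-regressor case: beta = s12 / s22 and sigma2 = s11 - s12**2 / s22."""
    return s12 / s22, s11 - s12 ** 2 / s22


def simulate_bivariate_normal(s11, s12, s22, n, rng):
    """Zero-mean normal draws (y_t, X_t) with Var(y)=s11, Cov(y, X)=s12, Var(X)=s22."""
    beta, sigma2 = regression_parameters(s11, s12, s22)
    draws = []
    for _ in range(n):
        x = math.sqrt(s22) * rng.gauss(0.0, 1.0)
        y = beta * x + math.sqrt(sigma2) * rng.gauss(0.0, 1.0)
        draws.append((y, x))
    return draws


rng = random.Random(0)          # fixed seed, chosen arbitrarily
s11, s12, s22 = 4.0, 1.5, 2.0   # hypothetical covariance entries
beta, sigma2 = regression_parameters(s11, s12, s22)
z = simulate_bivariate_normal(s11, s12, s22, 20000, rng)
n = len(z)
m11 = sum(y * y for y, _ in z) / n
m12 = sum(y * x for y, x in z) / n
m22 = sum(x * x for _, x in z) / n
print("from the joint parameters:", beta, sigma2)
print("from the sample moments:  ", regression_parameters(m11, m12, m22))
```

Applying the same functions to the sample second moments gives values close to those computed from the joint parameters, which is the sense in which $\theta$ is determined by the parameters of the joint distribution.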
In practice the statistical parameters of interest $\theta$ might not coincide with the theoretical parameters of interest, say $\xi$. In such a case we need to relate the two sets of parameters in such a way that the latter are uniquely determined by the former. That is, there exists a mapping
$$\xi = H(\theta), \tag{17.23}$$
which defines $\xi$ uniquely. This situation arises, for example, in the case of the simultaneous equations model, where the statistical parameters of interest are the parameters defining (21) but the theoretical parameters are different (see Chapter 25).
It must be stressed that the statistical GM postulated depends crucially on the information set chosen at the outset and it is well defined within such a context. When the information set is changed the statistical GM should be respecified to take account of the change. This implies that in econometric modelling we have to decide on the information set within which the specification of the statistical model will take place. This is one of the reasons why the statistical model is defined directly in terms of the random variables giving rise to the available observed data chosen and not in terms of the error term. The relevant information underlying the specification of the statistical GM comes in three forms:
(i) theoretical information;
(ii) sample information; and
(iii) measurement information.
In terms of Fig. 1.2 the theoretical information relates to the choice of the observed data series (and hence of $Z_t$) and the form of the estimable model. The sample information relates to the probabilistic structure of $\{Z_t, t \in \mathbb{T}\}$ and the measurement information to the measurement system of $Z_t$ and any exact relationships among the observed data chosen (see Chapter 26 for further discussion). Any theoretical information which can be tested as restrictions on $\theta$ is not imposed a priori, in order to be able to test it. An important implication of this is that the statistical GM is not restricted a priori to coincide with either the theoretical or estimable model apart from a white-noise term at the outset. Moreover, before any theoretical meaning is attached to the statistical GM we need to ensure that the latter is first well defined statistically, that is, that the underlying assumptions defining the statistical model are indeed valid for the data chosen. Testing the underlying assumptions is the task of misspecification testing (see Chapters 20-22). When these assumptions are tested and their validity established we can proceed with the reparametrisation/restriction in order to derive a theoretically meaningful GM, the empirical econometric model (see Fig. 1.2).
17.5 Looking ahead
As a prelude to the extensive discussion of the linear regression model and related statistical models of interest in econometrics let us summarise these in Table 17.1.
In the chapters which follow the statistical analysis (specification, misspecification, estimation and testing) of the above statistical models will be considered in some detail. In Chapter 18 the linear model is briefly considered in its simplest form ($k = 2$) in an attempt to motivate the linear
reason for the extensive discussion of the linear regression model is that this statistical model forms the backbone of Part IV. In Chapter 19 the estimation, specification testing and prediction in the context of the linear regression model are discussed. Departures from the assumptions (misspecification) underlying the linear regression model are discussed in Chapters 20-22. Chapter 23 considers the dynamic linear regression model, which is by far the most widely used statistical model in econometric modelling. This statistical model is viewed as a natural extension of the linear regression model in the case where the non-random sample is the appropriate sampling model. In Chapter 24 the multivariate linear regression model is discussed as a direct extension of the linear regression model. The simultaneous equation model, viewed as a reparametrisation of the multivariate linear regression model, is discussed in Chapter 25. In Chapter 26 the methodological discussion sketched in Chapter 1 is considered more extensively.
Important concepts
Time-series, cross-section and panel data, simple random sampling,
stratified sampling, incidental parameters, statistical generating
Appendix 17.1