Identification. Cheng Hsiao, University of Toronto. Handbook of Econometrics.


Chapter 4

IDENTIFICATION

CHENG HSIAO*
University of Toronto

Contents

1. Introduction
2. Basic concepts
3. Contemporaneous simultaneous equation models
   3.1. The model
   3.2. Observationally equivalent structures
   3.3. Identification in terms of trivial transformations
   3.4. Identification in terms of "linear estimable functions"
   3.5. Examples
4. Dynamic models with serially correlated residuals
   4.1. The model
   4.2. Observationally equivalent structures
   4.3. Linear restrictions on the coefficients
   4.4. Additional information about the disturbances
5. Non-linear a priori constraints and covariance restrictions
   5.1. Some useful theorems
   5.2. Contemporaneous error-shock models
   5.3. Covariance restrictions
6. Bayesian theory of identification and concluding remarks
Appendix
References

*This work was supported by National Science Foundation Grant SES80-07576 and by Social Sciences and Humanities Research Council of Canada Grant 410-80-0080. I am indebted to V. Bencivenga, J. Bossons, M. Deistler, Z. Griliches, F. M. Fisher, J. C. Ham, E. J. Hannan, M. D. Intriligator, J. B. Kadane, E. Leamer, K. Mahjoob, T. Rothenberg, and A. Zellner for helpful comments and discussions. All remaining errors are my own responsibility.

Handbook of Econometrics, Volume I, Edited by Z. Griliches and M. D. Intriligator. © North-Holland Publishing Company, 1983.

1. Introduction

The study of identification has been aptly linked to the design of experiments. In the biological and physical sciences, an investigator who wishes to make inferences about certain parameters can usually conduct controlled experiments to isolate relations. Presumably, in a well-designed experiment the treatment group and the control group are similar in every respect except for the treatment. The difference in response may therefore be attributed to the treatment, and the parameters of interest are identified.

In economics and other social sciences we are less fortunate. We observe certain facts (which are usually characterized by a set of quantities) and wish to arrange them in a meaningful way. Yet we cannot replace the natural conditions by laboratory conditions. We cannot control variables and isolate relations. The data are produced by an unknown structure, and the effects of a change in this structure are the objects of our study. None of these changes was produced by an investigator as in a laboratory experiment, and often the impact of one factor is confounded with the impacts of other factors.

To reduce complex real-world phenomena to manageable proportions, an economist has to make a theoretical abstraction. The result is a logical model presumably suited to explain the observed phenomena. That is, we assume that there exists an underlying structure which generated the observations of real-world data. However, statistical inference can relate only to characteristics of the distribution of the observed variables. Thus, a meaningful statistical interpretation of the real world through this structure can be achieved only if our assumption that real-world observations are generated by this structure, and this structure alone, is also true. The problem of whether it is possible to draw inferences from the probability distribution of the observed variables to an underlying theoretical structure is the concern of the econometric literature on identification. We now illustrate this concept using the well-known demand and supply example.
Example 1.1

Let p_t = price at time t and q_t = quantity at time t. Linear approximations of the demand and supply functions are

  q_t = a + b p_t + u_1t   (demand),                               (1.1)
  q_t = c + d p_t + u_2t   (supply).                               (1.2)

Assume u_1 and u_2 have an independent (over time) bivariate normal distribution with zero means, variances σ_11 and σ_22, and covariance σ_12. Solving for p_t and q_t we obtain the distribution of the observed variables: (p_t, q_t)' is normal with mean vector

  ( (a - c)/(d - b),  (da - bc)/(d - b) )'                         (1.3)

and covariance matrix

  V = 1/(d - b)^2  [ σ_11 + σ_22 - 2σ_12            dσ_11 - (b + d)σ_12 + bσ_22  ]
                   [ dσ_11 - (b + d)σ_12 + bσ_22    d^2 σ_11 + b^2 σ_22 - 2db σ_12 ].   (1.4)

There are five functions of the parameters that can be estimated (the two means and the three distinct elements of V), but there are seven parameters of interest. Therefore, in general we can only estimate these functions of the parameters and not any parameter itself. As is obvious, there are infinitely many possible values of (a, b, c, d, σ_11, σ_22, σ_12) which could all generate the observed data (p_t, q_t). Consequently, without additional information (in the form of a priori restrictions), the model specified by (1.1) and (1.2) cannot be estimated and therefore is not useful in confronting economic hypotheses with data.

The study of identifiability is undertaken in order to explore the limitations of statistical inference (when working with economic data) or to specify what sort of a priori information is needed to make model parameters estimable. It is a fundamental problem concomitant with the existence of a structure. Logically it precedes all problems of estimation or of testing hypotheses. The general formulation of the identification problem was given by Frisch (1934), Haavelmo (1944), Hurwicz (1950), Koopmans and Reiersøl (1950), Koopmans, Rubin and Leipnik (1950), Marschak (1942), Wald (1950), Working (1925, 1927), and others. An extensive study of the identifiability conditions for simultaneous equations models under various assumptions about the underlying structures was provided by Fisher (1966). In this chapter I intend to survey the development of the subject since the publication of Fisher's book, although some pre-1966 results will be briefly reviewed for the sake of completeness. Because the purpose of this chapter is expository, I shall draw freely on the work of Anderson (1972), Deistler (1975, 1976), Deistler and Schrader (1979), Drèze (1975), Fisher (1966), Hannan (1969, 1971), Hatanaka (1975), Johnston (1972), Kadane (1975), Kohn (1979), Koopmans and Reiersøl (1950), Preston and Wall (1973), Richmond (1974), Rothenberg (1971), Theil (1971), Wegge (1965), Zellner (1971), etc., without specific acknowledgement in each case.

In Section 2 we define the basic concepts of identification. Section 3 derives some identifiability criteria for contemporaneous simultaneous equation models under linear constraints; Section 4 derives some identifiability criteria for dynamic models. Section 5 discusses criteria for models subject to non-linear continuously differentiable constraints and covariance restrictions, with special emphasis on applications to errors-in-variables and variance components models. The Bayesian view of identification and concluding remarks are given in Section 6.

2. Basic concepts¹

It is generally assumed in econometrics that economic variables whose formation an economic theory is designed to explain have the characteristics of random variables. Let y be a set of such observations. A structure S is a complete specification of the probability distribution function of y, P(y). The set of all a priori possible structures S is called a model.
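To make Example 1.1 concrete in this vocabulary, the following sketch (my own illustration, not part of the original chapter; all parameter values are invented) builds a second parameter point by mixing the demand and supply equations with a non-singular matrix and renormalizing, and then confirms that both parameter points imply exactly the same mean vector (1.3) and covariance matrix (1.4), so data on (p_t, q_t) alone cannot distinguish them.

```python
import numpy as np

def reduced_form_moments(a, b, c, d, s11, s12, s22):
    """Mean and covariance of (p_t, q_t) implied by (1.1)-(1.2); see (1.3)-(1.4)."""
    mean = np.array([(a - c) / (d - b), (d * a - b * c) / (d - b)])
    var_p = (s11 + s22 - 2 * s12) / (d - b) ** 2
    cov_pq = (d * s11 - (b + d) * s12 + b * s22) / (d - b) ** 2
    var_q = (d ** 2 * s11 + b ** 2 * s22 - 2 * d * b * s12) / (d - b) ** 2
    return mean, np.array([[var_p, cov_pq], [cov_pq, var_q]])

# A first, arbitrary structure (hypothetical numbers).
theta1 = dict(a=10.0, b=-1.0, c=2.0, d=1.0, s11=1.0, s12=0.0, s22=1.0)

# Write it as B y_t + Gamma x_t = u_t with y_t = (q_t, p_t)' and x_t = 1.
B = np.array([[1.0, -theta1['b']],
              [1.0, -theta1['d']]])
Gamma = np.array([[-theta1['a']],
                  [-theta1['c']]])
Sigma = np.array([[theta1['s11'], theta1['s12']],
                  [theta1['s12'], theta1['s22']]])

# Mix the two equations with an arbitrary non-singular matrix F ...
F = np.array([[0.7, 0.3],
              [0.2, 0.8]])
B2, G2, S2 = F @ B, F @ Gamma, F @ Sigma @ F.T
# ... and renormalize so each equation again has coefficient 1 on q_t.
D = np.diag(1.0 / B2[:, 0])
B2, G2, S2 = D @ B2, D @ G2, D @ S2 @ D.T

theta2 = dict(a=float(-G2[0, 0]), b=float(-B2[0, 1]),
              c=float(-G2[1, 0]), d=float(-B2[1, 1]),
              s11=float(S2[0, 0]), s12=float(S2[0, 1]), s22=float(S2[1, 1]))

m1, V1 = reduced_form_moments(**theta1)
m2, V2 = reduced_form_moments(**theta2)
print("structure 1:", theta1)
print("structure 2:", theta2)    # different values for every structural parameter
print("same moments:", np.allclose(m1, m2) and np.allclose(V1, V2))  # True
```

The two parameter points differ in every structural coefficient yet generate identical distributions of the observed data, which is exactly the situation made precise by the definitions that follow; only the five moment functions themselves are estimable.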
The identification problem consists in making judgements about structures, given the model S and the observations y. In most applications, y is assumed to be generated by a parametric probability distribution function P(y|S) = P(y|α), where α is an m-dimensional real vector. The probability distribution function P is assumed known, conditional on α, but α is unknown. Hence, a structure is described by a parametric point α, and a model is a set of points A ⊂ R^m. Thus, the problem of distinguishing between structures is reduced to the problem of distinguishing between parameter points. In this framework we have the following definitions.

Definition 2.1
Two structures S = α and S̄ = ᾱ in A are said to be observationally equivalent if P(y|α) = P(y|ᾱ) for all y.

Definition 2.2
The structure S⁰ = α⁰ in A is "globally identified" if there is no other α in A which is observationally equivalent.

Since the set of structures is simply a subset of R^m, it is possible that there may be a number of observationally equivalent structures, but they are isolated from each other. It is natural then to consider the concept of local identification. We define this concept in terms of the distance between two structures.

¹Professor F. Fisher has pointed out to the author that "overidentification" is part of the general concept of identification and that we ought to distinguish collinearity and lack of identification. The concept of "overidentification" has been found relevant for the existence of sampling moments and the efficiency of the estimates in simultaneous-equations models; presumably these topics will be treated in the chapters on estimation and sample distribution. The problem of collinearity is a case of under-identification according to Definition 2.1. So in this chapter we ignore these concepts.

Definition 2.3
A structure S⁰ = α⁰ is "locally identified" if there exists an open neighborhood ω containing α⁰ such that no other α in ω is observationally equivalent to α⁰.

On many occasions a structure S may not be identifiable, yet some of its characteristics may still be uniquely determinable. Since the characteristics of a structure S are described by an m-dimensional real vector α, we define this concept of identifiability of a substructure in terms of functions of α.

Definition 2.4
Let ξ(α) be a function of α. ξ(α) is said to be (locally) identifiable if (there exists an open neighborhood ω such that) all parameter points which are observationally equivalent have the same value for ξ(α) (or lie outside ω).

A special case of ξ(α) is the coordinate functions. For instance, we may be interested in the identifiability of a subset of coordinates α_I of α. Then the subset of coordinates α_I of α⁰ is said to be locally identifiable if there exists an open neighborhood ω containing α⁰ such that all parameter points observationally equivalent to α⁰ have the same value for α_I or lie outside ω.

In this chapter, instead of deriving identifiability criteria from the probability law of y [Bowden (1973), Rothenberg (1971)], we shall focus on the first- and second-order moments of y only. If the y are normally distributed, all information is contained in the first- and second-order moments. If the y are not normally distributed, observational information apart from the first and second moments may be available [Reiersøl (1950)].
However, most estimation methods use second-order quantities only; also, if a structure is identifiable from the second-order moments, then it is identifiable from the probability law [Deistler and Seifert (1978)]. We shall therefore restrict ourselves to the first and second moments of y (or identifiability in the wide sense) for the sake of simplicity. Thus, we shall view two structures as observationally equivalent if they produce identical first- and second-order moments. Consequently, all the definitions stated above should be modified so that the statements with regard to the probability law of y are replaced by corresponding statements in terms of the first and second moments of y.

3. Contemporaneous simultaneous equation models

3.1. The model

In this section we consider the identification of a contemporaneous simultaneous equation model. We first discuss conditions for two structures to be observationally equivalent. We then derive identifiability criteria by checking conditions which will ensure that no two structures, or parts of two structures, are observationally equivalent. Finally, we illustrate the use of these conditions by considering some simple examples.

For simplicity, let an economic theory specify a set of economic relations of the form

  B y_t + Γ x_t = u_t,   t = 1,...,T,                              (3.1)

where y_t is a G × 1 vector of observed endogenous variables; x_t is a K × 1 vector of observed exogenous variables; B is a G × G matrix of coefficients; Γ is a G × K matrix of coefficients; and u_t is a G × 1 vector of unobserved disturbances. We assume that:

Assumption 3.1
B is non-singular.

Assumption 3.2
lim_{T→∞} ∑_{t=1}^T x_t x_t' / T exists and is non-singular.

Assumption 3.3
E u_t = 0, E u_t x_t' = 0, and

  E u_t u_s' = Σ   if t = s,
             = 0   otherwise.

We note that the criteria to be derived can allow for the existence of lagged endogenous variables or autocorrelation in the disturbance term, but not both. We shall indicate how these generalizations can be made in Section 3.4.

3.2. Observationally equivalent structures

Suppose u_t has the density P(u_t | Σ); then the joint density of (u_1,...,u_T) is

  ∏_{t=1}^T P(u_t | Σ).                                            (3.2)

The joint density of (y_1,...,y_T) can be derived through the structural relation (3.1) and the density of the u's. Conditional on x_t, we have:

  |det B|^T ∏_{t=1}^T P(B y_t + Γ x_t | Σ).                        (3.3)

Suppose that we multiply (3.1) through by a G × G non-singular matrix F. This would involve replacing each equation of the original structure by a linear combination of the equations in that structure. The new structure may be written as

  (FB) y_t + (FΓ) x_t = w_t,                                       (3.4)

where w_t = F u_t, so that

  P(w_t | FΣF') = |det F|^{-1} P(u_t | Σ).                         (3.5)

The conditional density for the endogenous variables determined from the new structure is

  |det FB|^T ∏_{t=1}^T P(w_t | FΣF') = |det F|^T |det B|^T |det F|^{-T} ∏_{t=1}^T P(u_t | Σ)
                                     = |det B|^T ∏_{t=1}^T P(u_t | Σ),          (3.6)

which is identical to the density (3.3) determined from the original structure. Hence, we say that the two structures (3.1) and (3.4) are observationally equivalent.

A special case of (3.4) occurs when we set F = B^{-1}, so that the transformed structure becomes

  y_t = Π x_t + v_t,                                               (3.7)

where

  B Π + Γ = 0,                                                     (3.8)
  v_t = B^{-1} u_t,
  E v_t v_s' = B^{-1} Σ B^{-1}' = V   if t = s,
             = 0                       otherwise.                  (3.9)

Eq. (3.7) is called the "reduced form" of the "structural system" (3.1). We can alternatively write down the density of y in terms of the reduced-form parameters (Π, V) as

  ∏_{t=1}^T P(y_t - Π x_t | V).                                    (3.10)

From (3.4) and (3.6) we know that (3.3) and (3.10) yield identical density functions for the endogenous variables.
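A quick numerical check of this equivalence may be useful here. The sketch below is my own illustration with random, made-up coefficient matrices (not an example from the chapter): it applies an arbitrary non-singular F to a structure (B, Γ, Σ) and verifies that the reduced-form quantities Π = -B^{-1}Γ and V = B^{-1}ΣB^{-1}' of (3.7)-(3.9) are unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)
G, K = 3, 2  # three equations, two exogenous variables (illustrative sizes)

# An arbitrary structure (B, Gamma, Sigma) with B non-singular (Assumption 3.1).
B = np.eye(G) + 0.3 * rng.standard_normal((G, G))
Gamma = rng.standard_normal((G, K))
S = rng.standard_normal((G, G))
Sigma = S @ S.T                      # a positive definite disturbance covariance

def reduced_form(B, Gamma, Sigma):
    """Pi and V from (3.8)-(3.9): B Pi + Gamma = 0, V = B^{-1} Sigma B^{-1}'."""
    Binv = np.linalg.inv(B)
    return -Binv @ Gamma, Binv @ Sigma @ Binv.T

# Premultiply the structure by an arbitrary non-singular F, as in (3.4)-(3.5).
F = np.eye(G) + 0.5 * rng.standard_normal((G, G))
Pi1, V1 = reduced_form(B, Gamma, Sigma)
Pi2, V2 = reduced_form(F @ B, F @ Gamma, F @ Sigma @ F.T)

print("Pi unchanged:", np.allclose(Pi1, Pi2))   # True
print("V  unchanged:", np.allclose(V1, V2))     # True
```

Any procedure that uses only the first and second moments therefore treats the original and transformed structures as identical, which is precisely the difficulty that a priori restrictions are introduced to resolve.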
Thus, if we postulate a set of structural relations (3.1) with reduced form (3.7), then all structures obtained by premultiplying the original structure by an arbitrary non-singular matrix of order G will have this same reduced form, and moreover, all these structures and the reduced form will be observationally equivalent.²

²In other words, we restrict our attention to the class of models which have identifiable reduced forms.

Given that we will focus only on the first and second moments, we may formally state the conditions for two structures to be observationally equivalent in the following lemma.

Lemma 3.2.1
Two structures S = (B, Γ, Σ) and S̄ = (B̄, Γ̄, Σ̄) are observationally equivalent if and only if the following equivalent conditions hold:
(i) B^{-1}Γ = B̄^{-1}Γ̄ and B^{-1}ΣB^{-1}' = B̄^{-1}Σ̄B̄^{-1}'.
(ii) There exists a non-singular matrix F such that

  [B̄, Γ̄] = F[B, Γ]                                                (3.11)

and

  Σ̄ = F Σ F'.                                                      (3.12)

Proof
(i) follows from (3.1), (3.7), and (3.10). The probability density of the data is assumed to be completely specified by the first and second moments, i.e. the reduced-form parameters (Π, V). If S and S̄ are observationally equivalent, they must have identical reduced-form parameter matrices and variance-covariance matrices. Condition (i) is exactly the condition which must be satisfied for the two reduced forms to be equal, and thus (i) is necessary and sufficient. Now consider (ii). Sufficiency of (ii) is easy to check using (i). To prove its necessity, suppose S and S̄ are observationally equivalent. Let F = B̄B^{-1}. Assumption 3.1 implies that F is non-singular. Then B^{-1}Γ = B̄^{-1}Γ̄ implies that (3.11) holds. Letting w_t = F u_t, we have (3.12).

If there were no a priori restrictions on the parameters of the model (3.1), any non-singular F would be admissible in the sense that the transformed structure satisfies the same restrictions as the model.³ The situation would indeed be hopeless. An infinite set of structures would have generated any set of sample observations. If economic theories are available to specify a set of a priori restrictions on the model, any transformed structure must also satisfy the same a priori restrictions if the transformed structure is to belong to the model that has been specified (i.e. if the transformation is to be admissible). For instance, suppose economic theory specifies that (3.1) must obey certain restrictions; then the transformed structure (3.4) will have to obey these same restrictions. A priori information on the parameters of the model thus rules out many structures which would otherwise be observationally equivalent, i.e. it implies restrictions on the elements of F.

³The word "admissible" has a different meaning in statistical decision theory. Its use in these two different contexts should not be confused.

In this section we shall assume that:

Assumption 3.4
All prior information is in the form of linear restrictions on B and Γ.

We ignore the information contained in the variance-covariance matrix for the moment. We shall discuss this situation together with non-linear a priori restrictions in Section 5.

The identification problem thus may be stated as follows:

(a) If one considers the transformation matrix F used to obtain the transformed structure (3.4), do the a priori restrictions on B and Γ imply sufficient restrictions on the elements of F to make some or all of the coefficients in the original and transformed structures identical (and thus identifiable)?

Since Assumption 3.2 ensures the identifiability of Π (which is consistently estimable by the ordinary least squares method), we may state the identification problem in an alternative but equivalent fashion.
(b) Assuming the elements of Π to be known, can one then solve for some or all of the elements of B and Γ uniquely?

3.3. Identification in terms of trivial transformations

We first consider the classical identifiability conditions for single equations or a set of equations. These will be expressed in terms of the equivalence conditions (3.11) and (3.12). From Definitions 2.2 and 2.4, we can define the identification of the gth equation and of the complete system as follows:⁴

Definition 3.3.1
The gth equation is identified if and only if all equivalent structures are related by admissible transformations which are trivial with respect to the gth equation. That is, every admissible transformation matrix has the form

  F = | f_11  ...  f_1g  ...  f_1G |
      |  .          .          .   |
      |  0    ...  f_gg  ...   0   |   (gth row)
      |  .          .          .   |
      | f_G1  ...  f_Gg  ...  f_GG |,                              (3.13)

i.e. the gth row is zero everywhere except for the element f_gg in the gth column.

Definition 3.3.2
A system of equations is identified if and only if all equations in the system are identified. That is, the admissible transformation matrix has the form

  F = | f_11   0    ...   0   |
      |  0    f_22  ...   0   |
      |  .     .     .    .   |
      |  0     0    ...  f_GG |.                                   (3.14)

When normalization conditions are imposed, such as setting the diagonal elements of B equal to 1 or constraining the variance of each equation to one, we constrain a structure S⁰ = α⁰ from a ray to a point; then f_gg in Definitions 3.3.1 and 3.3.2 will be equal to 1.

Let A = [B, Γ], and let a_g' be the gth row of A. We assume that there exists a (G + K) × R_g matrix φ_g and a 1 × R_g vector d_g' whose elements are known constants, such that all prior information about the gth equation of the model, including the normalization rule, takes the form

  a_g' φ_g = d_g'.                                                  (3.15)

⁴Strictly speaking, Definitions 3.3.1 and 3.3.2 are theorems derived from Definitions 2.2 and 2.4. However, the proof of these theorems is trivial and sheds no light on the problem. For simplicity of exposition, we treat them as definitions.

[...]

... theory of "estimable functions" [Richmond (1974)]. The approach yields rather simple proofs of theorems on the identifiability of individual parameters, thus providing more general results than the conditions for single-equation identification described above. The discussion is carried out at the level of the whole system of equations; that is to say, it is not confined to restrictions on the parameters of ...

  rank(A φ_g) = ...                                                 (3.21)

3.4. Identification in terms of "linear estimable functions"

We can also approach the identification problem by considering the reduced form (3.7). Assumptions 3.1-3.3 are sufficient to identify Π. We therefore treat the elements of Π as known and investigate the conditions which will allow some or all of the elements of B and Γ to be uniquely determined. We ...

... case, we now turn to the question of identifiability of individual parameters or a linear combination of the parameters. The problem can be conveniently put in the form of a linear parametric function, a linear combination of the structural parameters of the form ξ'α, where α here denotes the G(G + K) × 1 vector of structural coefficients and ξ' is a 1 × G(G + K) vector with known elements. From Definition 2.4 and the identifiability of Π, it follows that ...
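For the common case where the prior information (3.15) consists of exclusion (zero) restrictions plus a normalization, the identifiability of a single equation can be checked with the familiar rank condition: the submatrix of A = [B, Γ] formed from the columns in which the gth row has prescribed zeros must have rank G - 1 (the same requirement reappears for the dynamic model in Section 4 below). The sketch below is my own numerical illustration with made-up coefficients, not an example from the chapter: a two-equation demand and supply system in which income enters only the demand equation, so the supply equation satisfies the rank condition while the demand equation, having no exclusions, does not.

```python
import numpy as np

# Structure: y_t = (q_t, p_t)', x_t = (1, income_t)'.
# Demand: q = a + b*p + e*income + u1   -> row [1, -b | -a, -e]
# Supply: q = c + d*p            + u2   -> row [1, -d | -c,  0]
a, b, e, c, d = 10.0, -1.0, 0.5, 2.0, 1.0          # hypothetical values
A = np.array([[1.0, -b, -a, -e],
              [1.0, -d, -c, 0.0]])                 # A = [B, Gamma]
G = A.shape[0]

# Columns in which each row has a prescribed zero (exclusion restrictions).
prescribed_zeros = {0: [],      # demand: nothing excluded a priori
                    1: [3]}     # supply: income (column 3) excluded

for g, cols in prescribed_zeros.items():
    if not cols:
        print(f"equation {g}: no exclusions, rank condition cannot hold")
        continue
    sub = A[:, cols]                               # all rows, restricted columns
    rank = np.linalg.matrix_rank(sub)
    status = "identified" if rank == G - 1 else "not identified"
    print(f"equation {g}: rank {rank} (need {G - 1}) -> {status}")
```

This reproduces the usual textbook conclusion that an exogenous demand shifter identifies the supply curve but not the demand curve.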
... same purpose of constraining F(L) to a constant matrix F [Hannan (1971)]. For instance, in Theorem 4.3.1 we may replace Condition 4.3.2 by:

Condition 4.3.2'
The maximum degree of each column of [B(L), Γ(L)] is known a priori, and there exist G columns of [B(L), Γ(L)] such that the matrix of coefficient vectors corresponding to the maximum degrees of these G columns has rank G.

Similarly, instead of zero restrictions ...

... the gth row of A may satisfy

  a_g' φ_g = d_g',                                                  (4.24)

where φ_g and d_g are an R_g × [G(p + 1) + K(q + 1)] matrix and an R_g × 1 vector of known elements, respectively. Then we may replace Condition 4.3.3 in Theorem ...

(Footnote: Here we refer to the identification of a complete model. If only certain rows in A satisfy Condition 4.3.3, then the theorem should be modified to refer to the identification of those rows of A.) Similarly ...

... submatrix of A obtained by taking the columns of A with prescribed zeros in the gth row has rank G - 1.

Proof
Consider two observationally equivalent structures Ā and A. We know that there must exist a non-singular F such that Ā = FA by Lemma 3.2.1, and we wish to show that the gth row of F has zeros everywhere but the gth entry. Without loss of generality, let g = 1, so that we are considering the first row of ...

... the dynamic equations identified. From Section 3 we know that one such condition is:

Condition 4.3.3
Let A = [B_0, B_1, ..., B_p, Γ_0, Γ_1, ..., Γ_q]. Let at least (G - 1) zeros be prescribed in each row of A and let one element in each row of B_0 be prescribed as unity. Let the rank of each submatrix of A obtained by taking the columns of A with prescribed zeros in a certain row be G - 1. Then we ...

... condition holds.

Condition 4.3.3'''
Let A(L) = [B(L), Γ(L)]. Let at least (G - 1) zeros be prescribed in each row of A(L) and let one element in each row of B_0 be prescribed as unity. For each row, the matrix consisting of the columns of A(L) having prescribed null elements in that row is of rank (G - 1).

Proof
The sufficiency follows from the fact that under Condition 4.3.3''' F(L) must be diagonal. Condition 4.3.1 ...

... apart from the redundancy in each equation.¹² In this case the identification of a model is achieved when the only admissible transformation F(L) is a G × G non-singular diagonal matrix of rational functions of L. Condition 4.3.3''' is then necessary and sufficient for the identification of model (4.1) with Assumptions 4.2-4.5. For other identification conditions without the relatively left prime assumption, ...

... when B(L) is lower triangular with the diagonal elements of B_0 equal to unity and the maximum degree of the off-diagonal elements b_gl(L) less than the maximum degree of the diagonal elements b_gg(L) for g > l, the system is identified. When identities are present, this is equivalent to knowing that certain elements of the variance-covariance matrix of u_t are zero. Suppose that the last (G - G_1) equations ...
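As a minimal sketch of how the rank part of Condition 4.3.3 can be checked mechanically (my own illustration; the model, lag orders, and numbers are hypothetical, and the sketch does not verify the other requirements of Theorem 4.3.1, such as the degree and relative left-primeness conditions): stack the lag-coefficient matrices into A = [B_0, B_1, Γ_0], normalize the diagonal of B_0 to unity, and check row by row that the columns carrying that row's prescribed zeros have rank G - 1.

```python
import numpy as np

# A hypothetical two-equation dynamic model (p = 1, q = 0, K = 1):
#   y1t - 0.6*y2t + 0.5*y1,t-1              + 0.8*x_t = u1t
#   0.4*y1t + y2t              + 0.3*y2,t-1           = u2t
# so that A = [B0, B1, Gamma0] with the zero pattern below.
B0 = np.array([[1.0, -0.6],
               [0.4,  1.0]])
B1 = np.array([[0.5, 0.0],     # y2,t-1 excluded from equation 1
               [0.0, 0.3]])    # y1,t-1 excluded from equation 2
Gamma0 = np.array([[0.8],
                   [0.0]])     # x_t excluded from equation 2
A = np.hstack([B0, B1, Gamma0])
G = A.shape[0]

# Columns with zeros prescribed a priori in each row of A.
prescribed_zeros = {0: [3], 1: [2, 4]}

for g, cols in prescribed_zeros.items():
    rank = np.linalg.matrix_rank(A[:, cols])
    ok = "satisfied" if rank == G - 1 else "violated"
    print(f"row {g}: {len(cols)} zeros prescribed (need at least {G - 1}), "
          f"rank of restricted columns = {rank} -> rank condition {ok}")
```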
