Handbook of Economic Forecasting, Part 32


References (continued from the preceding chapter by V. Corradi and N.R. Swanson)

Storey, J.D. (2003). "The positive false discovery rate: A Bayesian interpretation and the q-value". Annals of Statistics 31, 2013–2035.
Sullivan, R., Timmermann, A., White, H. (1999). "Data-snooping, technical trading rule performance, and the bootstrap". Journal of Finance 54, 1647–1691.
Sullivan, R., Timmermann, A., White, H. (2001). "Dangers of data-mining: The case of calendar effects in stock returns". Journal of Econometrics 105, 249–286.
Swanson, N.R., White, H. (1997). "A model selection approach to real-time macroeconomic forecasting using linear models and artificial neural networks". Review of Economics and Statistics 59, 540–550.
Teräsvirta, T. (2006). "Forecasting economic variables with nonlinear models". In: Elliott, G., Granger, C.W.J., Timmermann, A. (Eds.), Handbook of Economic Forecasting. Elsevier, Amsterdam, pp. 413–457. Chapter 8 in this volume.
Thompson, S.B. (2002). "Evaluating the goodness of fit of conditional distributions, with an application to affine term structure models". Working Paper, Harvard University.
van der Vaart, A.W. (1998). Asymptotic Statistics. Cambridge University Press, New York.
Vuong, Q. (1989). "Likelihood ratio tests for model selection and non-nested hypotheses". Econometrica 57, 307–333.
Weiss, A. (1996). "Estimating time series models using the relevant cost function". Journal of Applied Econometrics 11, 539–560.
West, K.D. (1996). "Asymptotic inference about predictive ability". Econometrica 64, 1067–1084.
West, K.D. (2006). "Forecast evaluation". In: Elliott, G., Granger, C.W.J., Timmermann, A. (Eds.), Handbook of Economic Forecasting. Elsevier, Amsterdam, pp. 99–134. Chapter 3 in this volume.
West, K.D., McCracken, M.W. (1998). "Regression-based tests for predictive ability". International Economic Review 39, 817–840.
Whang, Y.J. (2000). "Consistent bootstrap tests of parametric regression functions". Journal of Econometrics, 27–46.
Whang, Y.J. (2001). "Consistent specification testing for conditional moment restrictions". Economics Letters 71, 299–306.
White, H. (1982). "Maximum likelihood estimation of misspecified models". Econometrica 50, 1–25.
White, H. (1994). Estimation, Inference and Specification Analysis. Cambridge University Press, Cambridge.
White, H. (2000). "A reality check for data snooping". Econometrica 68, 1097–1126.
Wooldridge, J.M. (2002). Econometric Analysis of Cross Section and Panel Data. MIT Press, Cambridge.
Zheng, J.X. (2000). "A consistent test of conditional parametric distribution". Econometric Theory 16, 667–691.

PART 2: FORECASTING MODELS

Chapter 6
FORECASTING WITH VARMA MODELS

HELMUT LÜTKEPOHL
Department of Economics, European University Institute, Via della Piazzuola 43, I-50133 Firenze, Italy
e-mail: helmut.luetkepohl@iue.it

Contents

Abstract
Keywords
1. Introduction and overview
   1.1. Historical notes
   1.2. Notation, terminology, abbreviations
2. VARMA processes
   2.1. Stationary processes
   2.2. Cointegrated I(1) processes
   2.3. Linear transformations of VARMA processes
   2.4. Forecasting
      2.4.1. General results
      2.4.2. Forecasting aggregated processes
   2.5. Extensions
      2.5.1. Deterministic terms
      2.5.2. More unit roots
      2.5.3. Non-Gaussian processes
3. Specifying and estimating VARMA models
   3.1. The echelon form
      3.1.1. Stationary processes
      3.1.2. I(1) processes
   3.2. Estimation of VARMA models for given lag orders and cointegrating rank
      3.2.1. ARMA_E models
      3.2.2. EC-ARMA_E models
   3.3. Testing for the cointegrating rank
   3.4. Specifying the lag orders and Kronecker indices
   3.5. Diagnostic checking
4. Forecasting with estimated processes
   4.1. General results
   4.2. Aggregated processes
5. Conclusions
Acknowledgements
References

Handbook of Economic Forecasting, Volume 1. Edited by Graham Elliott, Clive W.J. Granger and Allan Timmermann. © 2006 Elsevier B.V. All rights reserved. DOI: 10.1016/S1574-0706(05)01006-2

Abstract

Vector autoregressive moving-average (VARMA) processes are suitable models for producing linear forecasts of sets of time series variables. They provide parsimonious representations of linear data generation processes. The setup for these processes in the presence of stationary and cointegrated variables is considered. Moreover, unique or identified parameterizations based on the echelon form are presented. Model specification, estimation, model checking and forecasting are discussed. Special attention is paid to forecasting issues related to contemporaneously and temporally aggregated VARMA processes. Predictors for aggregated variables based alternatively on past information in the aggregated variables or on disaggregated information are compared.

Keywords: echelon form, Kronecker indices, model selection, vector autoregressive process, vector error correction model, cointegration

JEL classification: C32

1. Introduction and overview

In this chapter linear models for the conditional mean of a stochastic process are considered. These models are useful for producing linear forecasts of time series variables. Even if nonlinear features may be present in a given series and, hence, nonlinear forecasts are considered, linear forecasts can serve as a useful benchmark against which other forecasts may be evaluated. As pointed out by Teräsvirta (2006) in this Handbook, Chapter 8, they may be more robust than nonlinear forecasts. Therefore, in this chapter linear forecasting models and methods will be discussed.

Suppose that K related time series variables are considered, y_{1t}, ..., y_{Kt}, say. Defining y_t = (y_{1t}, ..., y_{Kt})′, a linear model for the conditional mean of the data generation process (DGP) of the observed series may be of the vector autoregressive (VAR) form,

(1.1)   y_t = A_1 y_{t-1} + ··· + A_p y_{t-p} + u_t,

where the A_i (i = 1, ..., p) are (K × K) coefficient matrices and u_t is a K-dimensional error term. If u_t is independent over time (i.e., u_t and u_s are independent for t ≠ s), the conditional mean of y_t, given past observations, is

y_{t|t-1} ≡ E(y_t | y_{t-1}, y_{t-2}, ...) = A_1 y_{t-1} + ··· + A_p y_{t-p}.

Thus, the model can be used directly for forecasting one period ahead, and forecasts with larger horizons can be computed recursively. Therefore, variants of this model will be the basic forecasting models in this chapter.

For practical purposes the simple VAR model of order p may have some disadvantages, however. The A_i parameter matrices will be unknown and have to be replaced by estimators. For an adequate representation of the DGP of a set of time series of interest a rather large VAR order p may be required. Hence, a large number of parameters may be necessary for an adequate description of the data. Given limited sample information this will usually result in low estimation precision, and forecasts based on VAR processes with estimated coefficients may also suffer from the uncertainty in the parameter estimators.
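Before turning to richer model classes, the forecast recursion just described can be illustrated with a short sketch. It treats the A_i as known; in practice they would be replaced by estimates, which is exactly the source of the additional forecast uncertainty discussed above. The coefficient matrices and terminal observations below are made-up numbers, not taken from the chapter.

```python
import numpy as np

# Illustrative VAR(2) coefficient matrices for K = 2 variables; the numbers are
# made up for this sketch.  A[i] plays the role of A_{i+1} in equation (1.1).
A = [np.array([[0.5, 0.1],
               [0.0, 0.4]]),
     np.array([[0.2, 0.0],
               [0.1, 0.1]])]

def var_forecasts(A, y_hist, h):
    """Recursive h-step forecasts for y_t = A_1 y_{t-1} + ... + A_p y_{t-p} + u_t.

    y_hist holds the last p (or more) observations, oldest first.  Unknown
    future errors are replaced by their mean, zero, so each forecast is the
    conditional mean given the observed (or previously forecast) past.
    """
    hist = [np.asarray(y) for y in y_hist]
    forecasts = []
    for _ in range(h):
        y_hat = sum(A_i @ hist[-(i + 1)] for i, A_i in enumerate(A))
        forecasts.append(y_hat)
        hist.append(y_hat)          # feed the forecast back in for the next step
    return np.array(forecasts)

# Two (made-up) terminal observations y_{T-1}, y_T and forecasts for T+1, ..., T+4.
y_hist = [np.array([0.3, -0.1]), np.array([0.6, 0.2])]
print(var_forecasts(A, y_hist, h=4))
```

With estimated coefficient matrices the same recursion applies; the consequences of the estimation step for forecast precision are the subject of Section 4.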
Given these drawbacks, it is useful to consider the larger model class of vector autoregressive moving-average (VARMA) models, which may be able to represent the DGP of interest in a more parsimonious way because they constitute a wider model class to choose from. In this chapter the analysis of models from that class will be discussed, although special case results for VAR processes will occasionally be noted explicitly. Of course, this framework includes univariate autoregressive (AR) and autoregressive moving-average (ARMA) processes. In particular, for univariate series the advantages of mixed ARMA models over pure finite order AR models for forecasting were found in early studies [e.g., Newbold and Granger (1974)]. The VARMA framework also includes the class of unobserved component models discussed by Harvey (2006) in this Handbook, who argues that these models forecast well in many situations. The VARMA class has the further advantage of being closed with respect to linear transformations, that is, a linearly transformed finite order VARMA process has again a finite order VARMA representation. Therefore linear aggregation issues can be studied within this class. In this chapter special attention will be given to results related to forecasting contemporaneously and temporally aggregated processes.

VARMA models can be parameterized in different ways. In other words, different parameterizations describe the same stochastic process. Although this is no problem for forecasting purposes, because we just need one adequate representation of the DGP, nonunique parameters are a problem at the estimation stage. Therefore the echelon form of a VARMA process is presented as a unique representation. Estimation and specification of this model form will be considered.

These models were first developed for stationary variables. In economics and other fields of application, however, many variables are generated by nonstationary processes. Often they can be made stationary by considering differences or changes rather than the levels. A variable is called integrated of order d (I(d)) if it is still nonstationary after taking differences d − 1 times but can be made stationary or asymptotically stationary by differencing d times. In most of the following discussion the variables will be assumed to be stationary (I(0)) or integrated of order 1 (I(1)), and they may be cointegrated. In other words, there may be linear combinations of I(1) variables which are I(0). If cointegration is present, it is often advantageous to separate the cointegration relations from the short-run dynamics of the DGP. This can be done conveniently by allowing for an error correction or equilibrium correction (EC) term in the models, and EC echelon forms will also be considered.

The model setup for stationary and integrated or cointegrated variables will be presented in the next section, where forecasting with VARMA models will also be considered under the assumption that the DGP is known. In practice it is, of course, necessary to specify and estimate a model for the DGP on the basis of a given set of time series. Model specification, estimation and model checking are discussed in Section 3, and forecasting with estimated models is considered in Section 4. Conclusions follow in Section 5.

1.1. Historical notes

The successful use of univariate ARMA models for forecasting has motivated researchers to extend the model class to the multivariate case.
It is plausible to expect that using more information, by including more interrelated variables in the model, improves the forecast precision. This is actually the idea underlying Granger's influential definition of causality [Granger (1969a)]. It turned out, however, that generalizing univariate models to multivariate ones is far from trivial in the ARMA case. Early on, Quenouille (1957) considered multivariate VARMA models. It became quickly apparent, however, that the specification and estimation of such models was much more difficult than for univariate ARMA models. The success of the Box–Jenkins modelling strategy for univariate ARMA models in the 1970s [Box and Jenkins (1976), Newbold and Granger (1974), Granger and Newbold (1977, Section 5.6)] triggered further attempts at using the corresponding multivariate models and developing estimation and specification strategies. In particular, the possibility of using autocorrelations, partial autocorrelations and cross-correlations between the variables for model specification was explored. Because modelling strategies based on such quantities had been to some extent successful in the univariate Box–Jenkins approach, it was plausible to try multivariate extensions. Examples of such attempts are Tiao and Box (1981), Tiao and Tsay (1983, 1989), Tsay (1989a, 1989b), Wallis (1977), Zellner and Palm (1974), Granger and Newbold (1977, Chapter 7), Jenkins and Alavi (1981). It soon became clear, however, that these strategies were at best promising for very small systems of two or perhaps three variables. Moreover, the most useful setup of multiple time series models was under discussion because VARMA representations are not unique or, to use econometric terminology, they are not identified. Important early discussions of the related problems are due to Hannan (1970, 1976, 1979, 1981), Dunsmuir and Hannan (1976) and Akaike (1974). A rather general solution to the structure theory for VARMA models was later presented by Hannan and Deistler (1988). Understanding the structural problems contributed to the development of complete specification strategies. By now, textbook treatments of modelling, analyzing and forecasting VARMA processes are available [Lütkepohl (2005), Reinsel (1993)].

The problems related to VARMA models were perhaps also relevant for a parallel development of pure VAR models as important tools for economic analysis and forecasting. Sims (1980) launched a general critique of classical econometric modelling and proposed VAR models as alternatives. A short while later the concept of cointegration was developed by Granger (1981) and Engle and Granger (1987). It is conveniently placed into the VAR framework, as shown by the latter authors and Johansen (1995a). Therefore it is perhaps not surprising that VAR models dominate time series econometrics, although the methodology and software for working with more general VARMA models is nowadays available.

A recent previous overview of forecasting with VARMA processes is given by Lütkepohl (2002). The present review draws partly on that article and on a monograph by Lütkepohl (1987).

1.2. Notation, terminology, abbreviations

The following notation and terminology are used in this chapter. The lag operator, also sometimes called the backshift operator, is denoted by L and is defined as usual by L y_t ≡ y_{t-1}. The differencing operator is denoted by Δ, that is, Δy_t ≡ y_t − y_{t-1}.
For a random variable or random vector x, x ∼ (μ, Σ) signifies that its mean (vector) is μ and its variance (covariance matrix) is Σ. The (K × K) identity matrix is denoted by I_K, and the determinant and trace of a matrix A are denoted by det A and tr A, respectively. For quantities A_1, ..., A_p, diag[A_1, ..., A_p] denotes the diagonal or block-diagonal matrix with A_1, ..., A_p on the diagonal. The natural logarithm of a real number is signified by log. The symbols Z, N and C are used for the integers, the positive integers and the complex numbers, respectively.

DGP stands for data generation process. VAR, AR, MA, ARMA and VARMA are used as abbreviations for vector autoregressive, autoregressive, moving-average, autoregressive moving-average and vector autoregressive moving-average (process). Error correction is abbreviated as EC, and VECM is short for vector error correction model. The echelon forms of VARMA and EC-VARMA processes are denoted by ARMA_E and EC-ARMA_E, respectively. OLS, GLS, ML and RR abbreviate ordinary least squares, generalized least squares, maximum likelihood and reduced rank, respectively. LR and MSE are used to abbreviate likelihood ratio and mean squared error.

2. VARMA processes

2.1. Stationary processes

Suppose the DGP of the K-dimensional multiple time series, y_1, ..., y_T, is stationary, that is, its first and second moments are time invariant. It is a (finite order) VARMA process if it can be represented in the general form

(2.1)   A_0 y_t = A_1 y_{t-1} + ··· + A_p y_{t-p} + M_0 u_t + M_1 u_{t-1} + ··· + M_q u_{t-q},   t = 0, ±1, ±2, ...,

where A_0, A_1, ..., A_p are (K × K) autoregressive parameter matrices while M_0, M_1, ..., M_q are moving-average parameter matrices also of dimension (K × K). Defining the VAR and MA operators, respectively, as A(L) = A_0 − A_1 L − ··· − A_p L^p and M(L) = M_0 + M_1 L + ··· + M_q L^q, the model can be written in more compact notation as

(2.2)   A(L) y_t = M(L) u_t,   t ∈ Z.

Here u_t is a white-noise process with zero mean, nonsingular, time-invariant covariance matrix E(u_t u_t′) = Σ_u and zero covariances, E(u_t u_{t-h}′) = 0 for h = ±1, ±2, .... The zero-order matrices A_0 and M_0 are assumed to be nonsingular. They will often be identical, A_0 = M_0, and in many cases they will be equal to the identity matrix, A_0 = M_0 = I_K. To indicate the orders of the VAR and MA operators, the process (2.1) is sometimes called a VARMA(p, q) process. Notice, however, that so far we have not made further assumptions regarding the parameter matrices, so that some or all of the elements of the A_i and M_j may be zero. In other words, there may be a VARMA representation with VAR or MA orders less than p and q, respectively. Obviously, the VAR model (1.1) is a VARMA(p, 0) special case with A_0 = I_K and M(L) = I_K. It may also be worth pointing out that there are no deterministic terms such as nonzero mean terms in our basic VARMA model (2.1). These terms are ignored here for convenience although they are important in practice. The necessary modifications for deterministic terms will be discussed in Section 2.5.

The matrix polynomials in (2.2) are assumed to satisfy

(2.3)   det A(z) ≠ 0 for |z| ≤ 1, and det M(z) ≠ 0 for |z| ≤ 1, z ∈ C.

The first of these conditions ensures that the VAR operator is stable and the process is stationary. Then it has a pure MA representation

(2.4)   y_t = Σ_{i=0}^{∞} Φ_i u_{t-i}

with MA operator Φ(L) = Φ_0 + Σ_{i=1}^{∞} Φ_i L^i = A(L)^{-1} M(L).
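The coefficient matrices of Φ(L) can be computed term by term: matching powers of L in A(L)Φ(L) = M(L) gives, for A_0 = M_0 = I_K, Φ_0 = I_K and Φ_i = M_i + Σ_{j=1}^{min(i,p)} A_j Φ_{i-j}, with M_i = 0 for i > q. The sketch below implements this recursion together with a companion-matrix check of the first part of condition (2.3); the VARMA(1,1) matrices are made up purely for illustration and do not appear in the chapter.

```python
import numpy as np

def ma_coefficients(A, M, n):
    """Phi_0, ..., Phi_n of the pure MA representation (2.4), assuming A_0 = M_0 = I_K.

    Matching powers of L in A(L) Phi(L) = M(L) yields Phi_0 = I_K and
    Phi_i = M_i + sum_{j=1}^{min(i,p)} A_j Phi_{i-j}, with M_i = 0 for i > q.
    """
    K = (A[0] if A else M[0]).shape[0]
    p, q = len(A), len(M)
    Phi = [np.eye(K)]
    for i in range(1, n + 1):
        Phi_i = M[i - 1].copy() if i <= q else np.zeros((K, K))
        for j in range(1, min(i, p) + 1):
            Phi_i += A[j - 1] @ Phi[i - j]
        Phi.append(Phi_i)
    return Phi

def var_part_is_stable(A):
    """Check the first part of (2.3), det A(z) != 0 for |z| <= 1, with A_0 = I_K:
    stability holds iff every eigenvalue of the companion matrix has modulus < 1."""
    K, p = A[0].shape[0], len(A)
    companion = np.zeros((K * p, K * p))
    companion[:K, :] = np.hstack(A)
    if p > 1:
        companion[K:, :-K] = np.eye(K * (p - 1))
    return bool(np.all(np.abs(np.linalg.eigvals(companion)) < 1))

# Illustrative VARMA(1,1) with K = 2; the matrices are made up for this example.
A = [np.array([[0.5, 0.1], [0.0, 0.4]])]
M = [np.array([[0.3, 0.0], [0.2, 0.1]])]
print(var_part_is_stable(A))            # True: eigenvalues are 0.5 and 0.4
print(ma_coefficients(A, M, n=3)[1])    # Phi_1 = A_1 + M_1 for a VARMA(1,1)
```

Truncating Φ(L) after a moderate number of terms is harmless for a stable process because the Φ_i decay at a geometric rate.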
Notice that Φ_0 = I_K if A_0 = M_0 and, in particular, if both zero-order matrices are identity matrices. In that case (2.4) is just the Wold MA representation of the process and, as we will see later, the u_t are just the one-step ahead forecast errors. Some of the forthcoming results are valid for more general stationary processes with Wold representation (2.4) which may not come from a finite order VARMA representation. In that case, it is assumed that the Φ_i are absolutely summable so that the infinite sum in (2.4) is well-defined.

The second part of condition (2.3) is the usual invertibility condition for the MA operator, which implies the existence of a pure VAR representation of the process,

(2.5)   y_t = Σ_{i=1}^{∞} Π_i y_{t-i} + u_t,

where A_0 = M_0 is assumed and Π(L) = I_K − Σ_{i=1}^{∞} Π_i L^i = M(L)^{-1} A(L). Occasionally invertibility of the MA operator will not be a necessary condition. In that case, it is assumed without loss of generality that det M(z) ≠ 0 for |z| < 1. In other words, the roots of the MA operator are outside or on the unit circle. There are still no roots inside the unit circle, however. This assumption can be made without loss of generality because it can be shown that for an MA process with roots inside the complex unit circle an equivalent one exists which has all its roots outside and on the unit circle.

It may be worth noting at this stage already that every pair of operators A(L), M(L) which leads to the same transfer functions Φ(L) and Π(L) defines an equivalent VARMA representation for y_t. This nonuniqueness problem of the VARMA representation will become important when parameter estimation is discussed in Section 3.

As specified in (2.1), we are assuming that the process is defined for all t ∈ Z. For stable, stationary processes this assumption is convenient because it avoids considering issues related to initial conditions. Alternatively, one could define y_t to be generated by a VARMA process such as (2.1) for t ∈ N, and specify the initial values y_0, ..., y_{-p+1}, u_0, ..., u_{-q+1} separately. Under our assumptions they can be defined such that y_t is stationary. Another possibility would be to define fixed initial values or perhaps even y_0 = ··· = y_{-p+1} = u_0 = ··· = u_{-q+1} = 0. In general, such an assumption implies that the process is not stationary but just asymptotically stationary, that is, the first and second order moments converge to the corresponding quantities of the stationary process obtained by specifying the initial conditions accordingly or defining y_t for t ∈ Z. The issue of defining initial values properly becomes more important for the nonstationary processes discussed in Section 2.2.

Both the MA and the VAR representations of the process will be convenient to work with in particular situations. Another useful representation of a stationary VARMA ...
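The zero-initial-value scheme described above is easy to try out in a short simulation. The sketch below generates a sample from (2.1) with A_0 = M_0 = I_K, Gaussian errors (an assumption made only for this example) and a burn-in period; the VARMA(1,1) matrices are the same made-up ones as in the previous sketch.

```python
import numpy as np

def simulate_varma(A, M, sigma_u, T, burn_in=200, seed=0):
    """Simulate T observations from (2.1) with A_0 = M_0 = I_K.

    The pre-sample values y_0, ..., y_{-p+1} and u_0, ..., u_{-q+1} are set to
    zero, so the generated path is only asymptotically stationary; the burn-in
    discards the stretch over which these start-up values still matter.
    """
    rng = np.random.default_rng(seed)
    K, p, q = sigma_u.shape[0], len(A), len(M)
    n = T + burn_in
    u = np.zeros((n + q, K))                       # leading rows: zero pre-sample errors
    u[q:] = rng.multivariate_normal(np.zeros(K), sigma_u, size=n)
    y = np.zeros((n + p, K))                       # leading rows: zero initial values
    for t in range(n):
        ar = sum(A[i - 1] @ y[p + t - i] for i in range(1, p + 1))
        ma = u[q + t] + sum(M[j - 1] @ u[q + t - j] for j in range(1, q + 1))
        y[p + t] = ar + ma
    return y[p + burn_in:]

# Same made-up VARMA(1,1) matrices as in the previous sketch.
A = [np.array([[0.5, 0.1], [0.0, 0.4]])]
M = [np.array([[0.3, 0.0], [0.2, 0.1]])]
sample = simulate_varma(A, M, sigma_u=np.eye(2), T=500)
print(sample.mean(axis=0))                         # close to the zero process mean
```

Discarding a burn-in is the practical counterpart of the asymptotic stationarity argument above: under the stability condition in (2.3) the influence of the zero start-up values dies out at a geometric rate, so the retained draws behave like draws from the stationary process.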
