Chapter 9

APPROXIMATE NONLINEAR FORECASTING METHODS

HALBERT WHITE
Department of Economics, UC San Diego

Contents

Abstract
Keywords
1. Introduction
2. Linearity and nonlinearity
   2.1. Linearity
   2.2. Nonlinearity
3. Linear, nonlinear, and highly nonlinear approximation
4. Artificial neural networks
   4.1. General considerations
   4.2. Generically comprehensively revealing activation functions
5. QuickNet
   5.1. A prototype QuickNet algorithm
   5.2. Constructing $\Gamma_m$
   5.3. Controlling overfit
6. Interpretational issues
   6.1. Interpreting approximation-based forecasts
   6.2. Explaining remarkable forecast outcomes
      6.2.1. Population-based forecast explanation
      6.2.2. Sample-based forecast explanation
   6.3. Explaining adverse forecast outcomes
7. Empirical examples
   7.1. Estimating nonlinear forecasting models
   7.2. Explaining forecast outcomes
8. Summary and concluding remarks
Acknowledgements
References

Handbook of Economic Forecasting, Volume 1
Edited by Graham Elliott, Clive W.J. Granger and Allan Timmermann
© 2006 Elsevier B.V. All rights reserved
DOI: 10.1016/S1574-0706(05)01009-8

Abstract

We review key aspects of forecasting using nonlinear models. Because economic models are typically misspecified, the resulting forecasts provide only an approximation to the best possible forecast. Although it is in principle possible to obtain superior approximations to the optimal forecast using nonlinear methods, there are some potentially serious practical challenges. Primary among these are computational difficulties, the dangers of overfit, and potential difficulties of interpretation. In this chapter we discuss these issues in detail. Then we propose and illustrate the use of a new family of methods (QuickNet) that achieves the benefits of using a forecasting model that is nonlinear in the predictors while avoiding or mitigating the other challenges to the use of nonlinear forecasting methods.

Keywords

prediction, misspecification, approximation, nonlinear methods, highly nonlinear methods, artificial neural networks, ridgelets, forecast explanation, model selection, QuickNet

JEL classification: C13, C14, C20, C45, C51, C43

1. Introduction

In this chapter we focus on obtaining a point forecast or prediction of a “target variable” $Y_t$ given a $k \times 1$ vector of “predictors” $X_t$ (with $k$ a finite integer). For simplicity, we take $Y_t$ to be a scalar. Typically, $X_t$ is known or observed prior to the realization of $Y_t$, so the “$t$” subscript on $X_t$ designates the observation index for which a prediction is to be made, rather than the time period in which $X_t$ is first observed. The discussion to follow does not strictly require this time precedence, although we proceed with this convention implicit.
Thus, in a typical time-series application, $X_t$ may contain lagged values of $Y_t$, as well as values of other variables known prior to time $t$.

Although we use the generic observation index $t$ throughout, it is important to stress that our discussion applies quite broadly, and not just to pure time-series forecasting. An increasingly important use of prediction models involves cross-section or panel data. In these applications, $Y_t$ denotes the outcome variable for a generic individual $t$ and $X_t$ denotes predictors for the individual’s outcome, observable prior to the outcome. Once the prediction model has been constructed using the available cross-section or panel data, it is then used to evaluate new cases whose outcomes are unknown.

For example, banks or other financial institutions now use prediction models extensively to forecast whether a new applicant for credit will be a good risk or not. If the prediction is favorable, then credit will be granted; otherwise, the application may be denied or referred for further review. These prediction models are built using cross-section or panel data collected by the firm itself and/or purchased from third party vendors. These data sets contain observations on individual attributes $X_t$, corresponding to information on the application, as well as subsequent outcome information $Y_t$, such as late payment or default. The reader may find it helpful to keep such applications in mind in what follows so as not to fall into the trap of interpreting the following discussion too narrowly.

Because of our focus on these broader applications of forecasting, we shall not delve very deeply into the purely time-series aspects of the subject. Fortunately, Chapter 8 in this volume by Teräsvirta (2006) contains an excellent treatment of these issues. In particular, there are a number of interesting and important issues that arise when considering multi-step-ahead time-series forecasts, as opposed to single-step-ahead forecasts.
In time-series applications of the results here, we implicitly operate with the convention that multi-step forecasts are constructed using the direct approach in which a different forecast model is constructed for each forecast horizon. The reader is urged to consult Teräsvirta’s chapter for a wealth of time-series material complementary to the present chapter.

There is a vast array of methods for producing point forecasts, but for convenience, simplicity, and practical relevance we restrict our discussion to point forecasts constructed as approximations to the conditional expectation (mean) of $Y_t$ given $X_t$,

$$\mu(X_t) \equiv E(Y_t \mid X_t).$$

It is well known that $\mu(X_t)$ provides the best possible prediction of $Y_t$ given $X_t$ in terms of prediction mean squared error (PMSE), provided $Y_t$ has finite variance. That is, the function $\mu$ solves the problem

$$\min_{m \in M} E\left[ \left( Y_t - m(X_t) \right)^2 \right], \tag{1}$$

where $M$ is the collection of functions $m$ of $X_t$ having finite variance, and $E$ is the expectation taken with respect to the joint distribution of $Y_t$ and $X_t$.

By restricting attention to forecasts based on the conditional mean, we neglect forecasts that arise from the use of loss functions other than PMSE, such as prediction mean absolute error, which yields predictions based on the conditional median, or its asymmetric analogs, which yield predictions based on conditional quantiles [e.g., Koenker and Bassett (1978), Kim and White (2003)]. Although we provide no further explicit discussion here, the methods we describe for obtaining PMSE-based forecasts do have immediate analogs for other such important loss functions. Our focus on PMSE leads naturally to methods of least-squares estimation, which underlie the vast majority of forecasting applications, providing our discussion with its intended practical relevance.

If $\mu$ were known, then we could finish our exposition here in short order: $\mu$ provides the PMSE-optimal method for constructing forecasts and that is that.
Or, if we knew the conditional distribution of $Y_t$ given $X_t$, then $\mu$ would again be known, as it can be obtained from this distribution. Typically, however, we do not have this knowledge. Confronted with such ignorance, forecasters typically proceed by specifying a model for $\mu$, that is, a collection $M$ (note our notation above) of functions of $X_t$. If $\mu$ belongs to $M$, then we say the model is “correctly specified”. (So, for example, if $Y_t$ has finite variance, then the model $M$ of functions $m$ of $X_t$ having finite variance is correctly specified, as $\mu$ is in fact such a function.) If $M$ is sufficiently restricted that $\mu$ does not belong to $M$, then we say that the model is “misspecified”.

Here we adopt the pragmatic view that either out of convenience or ignorance (typically both) we work with a misspecified model for $\mu$. By taking $M$ to be as specified in (1), we can generally avoid misspecification, but this is not necessarily convenient, as the generality of this choice poses special challenges for statistical estimation. (This choice for $M$ leads to nonparametric methods of statistical estimation.) Restricting $M$ leads to more convenient estimation procedures, and it is especially convenient, as we do here, to work with parametric models for $\mu$. Unfortunately, we rarely have enough information about $\mu$ to correctly specify a parametric model for it.

When one’s goal is to make predictions, the use of a misspecified model is by no means fatal. Our predictions will not be as good as they would be if $\mu$ were accessible, but to the extent that we can approximate $\mu$ more or less well, then our predictions will still be more or less accurate. As we discuss below, any model $M$ provides us with a means of approximating $\mu$, and it is for this reason that we declared above that our focus will be on “forecasts constructed as approximations” to $\mu$. The challenge then is to choose $M$ suitably, where by “suitably”, we mean in such a way as to conveniently provide a good approximation to $\mu$. Our discussion to follow elaborates our notions of convenience and goodness of approximation.

2. Linearity and nonlinearity

2.1. Linearity

Parametric models are models whose elements are indexed by a finite-dimensional parameter vector. An important and familiar example is the linear parametric model. This model is generated by the function

$$l(x, \beta) \equiv x'\beta.$$

We call $\beta$ a “parameter vector”, and, as $\beta$ conforms with the predictors (represented here by $x$), we have $\beta$ belonging to the “parameter space” $\mathbb{R}^k$, $k$-dimensional real Euclidean space. The linear parametric model is then the collection of functions

$$L \equiv \left\{ m : \mathbb{R}^k \to \mathbb{R} \mid m(x) = l(x, \beta) \equiv x'\beta,\ \beta \in \mathbb{R}^k \right\}.$$

We call the function $l$ the “model parameterization”, or simply the “parameterization”. We see here that each model element $l(\cdot, \beta)$ of $L$ is a linear function of $x$. It is standard to set the first element of $x$ to the constant unity, so in fact $l(\cdot, \beta)$ is an affine function of the nonconstant elements of $x$. For simplicity, we nevertheless refer to $l(\cdot, \beta)$ in this context as “linear in $x$”, and we call forecasts based on a parameterization linear in the predictors a “linear forecast”.

For fixed $x$, the parameterization $l(x, \cdot)$ is also linear in the parameters. In discussing linearity or nonlinearity of the parameterization (equivalently, of the parametric model), it is important generally to specify whether one is referring to the predictors $x$ or to the parameters $\beta$. Here, however, this doesn’t matter, as we have linearity either way.

Solving problem (1) with $M = L$, that is, solving

$$\min_{m \in L} E\left[ \left( Y_t - m(X_t) \right)^2 \right],$$

yields $l(\cdot, \beta^*)$, where

$$\beta^* = \arg\min_{\beta \in \mathbb{R}^k} E\left[ \left( Y_t - X_t'\beta \right)^2 \right]. \tag{2}$$

We call $\beta^*$ the “PMSE-optimal coefficient vector”. This delivers not only the best forecast for $Y_t$ given $X_t$ based on the linear model $L$, but also the optimal linear approximation to $\mu$, as discussed by White (1980).
To establish this optimal approximation property, observe that

$$
\begin{aligned}
E\left[ \left( Y_t - X_t'\beta \right)^2 \right]
&= E\left[ \left( Y_t - \mu(X_t) + \mu(X_t) - X_t'\beta \right)^2 \right] \\
&= E\left[ \left( Y_t - \mu(X_t) \right)^2 \right]
 + E\left[ \left( \mu(X_t) - X_t'\beta \right)^2 \right]
 + 2E\left[ \left( Y_t - \mu(X_t) \right)\left( \mu(X_t) - X_t'\beta \right) \right].
\end{aligned}
$$
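
The decomposition can be checked numerically. The following is a minimal sketch, not part of the chapter: the simulated design (a single standard normal predictor, the conditional mean $\mu(x) = \exp(0.5x)$, unit-variance Gaussian noise) and the helper name `pmse_decomposition` are assumptions chosen purely for illustration. The sketch computes the sample analog of $\beta^*$ in (2) by least squares and evaluates the three terms on the right-hand side. Because $E[Y_t - \mu(X_t) \mid X_t] = 0$, the law of iterated expectations implies that the cross term is zero in the population, so its sample value should be close to zero, and regressing $Y_t$ or $\mu(X_t)$ on $X_t$ should give nearly the same coefficients.

```python
import numpy as np


def pmse_decomposition(y, mu, X, beta):
    """Sample analogs of the four terms in the PMSE decomposition.

    Returns (total, noise, approx, cross), where
    total  = mean (y - X beta)^2
    noise  = mean (y - mu)^2
    approx = mean (mu - X beta)^2
    cross  = 2 * mean (y - mu)(mu - X beta)
    """
    fit = X @ beta
    total = np.mean((y - fit) ** 2)
    noise = np.mean((y - mu) ** 2)
    approx = np.mean((mu - fit) ** 2)
    cross = 2.0 * np.mean((y - mu) * (mu - fit))
    return total, noise, approx, cross


# Hypothetical data-generating process (an assumption, not from the chapter).
rng = np.random.default_rng(0)
n = 200_000
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])  # first element of x set to unity
mu = np.exp(0.5 * x)                  # nonlinear conditional mean mu(X_t)
y = mu + rng.normal(size=n)           # E(Y_t | X_t) = mu(X_t) by construction

# Sample analog of beta* in (2): least squares of Y_t on X_t.
beta_y, *_ = np.linalg.lstsq(X, y, rcond=None)
# Best linear approximation to mu itself: least squares of mu(X_t) on X_t.
beta_mu, *_ = np.linalg.lstsq(X, mu, rcond=None)

total, noise, approx, cross = pmse_decomposition(y, mu, X, beta_y)

print("beta from Y on X :", beta_y)
print("beta from mu on X:", beta_mu)
print("total, noise + approx + cross:", total, noise + approx + cross)
print("cross term:", cross)
```

The first term does not depend on $\beta$, so once the cross term drops out, minimizing the PMSE over $\beta$ is the same as minimizing the mean squared distance between $\mu(X_t)$ and $X_t'\beta$, which is the sense in which the linear forecast is an optimal approximation to $\mu$.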