The acf can now be obtained by dividing the covariances by the variance, so that

$$\tau_0 = \frac{\gamma_0}{\gamma_0} = 1 \qquad (8A.72)$$

$$\tau_1 = \frac{\gamma_1}{\gamma_0} = \frac{\phi_1\,\sigma^2/(1-\phi_1^2)}{\sigma^2/(1-\phi_1^2)} = \phi_1 \qquad (8A.73)$$

$$\tau_2 = \frac{\gamma_2}{\gamma_0} = \frac{\phi_1^2\,\sigma^2/(1-\phi_1^2)}{\sigma^2/(1-\phi_1^2)} = \phi_1^2 \qquad (8A.74)$$

$$\tau_3 = \phi_1^3 \qquad (8A.75)$$

The autocorrelation at lag $s$ is given by

$$\tau_s = \phi_1^s \qquad (8A.76)$$

which means that $\mathrm{corr}(y_t, y_{t-s}) = \phi_1^s$. Note that use of the Yule–Walker equations would have given the same answer.
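The result $\tau_s = \phi_1^s$ is easy to verify numerically. The following is a minimal sketch in Python with numpy (our choice of tooling, not the book's), simulating a long AR(1) series with an assumed $\phi_1 = 0.7$ and comparing sample and theoretical autocorrelations.

```python
import numpy as np

rng = np.random.default_rng(42)
phi1, n = 0.7, 200_000

# Simulate y_t = phi1 * y_{t-1} + u_t with white-noise disturbances.
u = rng.standard_normal(n)
y = np.zeros(n)
for t in range(1, n):
    y[t] = phi1 * y[t - 1] + u[t]

# Sample autocorrelation at lag s versus the theoretical phi1**s.
for s in range(4):
    sample_acf = np.corrcoef(y[s:], y[: n - s])[0, 1]
    print(f"lag {s}: sample {sample_acf:.3f}  theoretical {phi1 ** s:.3f}")
```

With a sample this long, the printed sample autocorrelations agree with $\phi_1^s$ to two or three decimal places.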
9 Forecast evaluation

Learning outcomes
In this chapter, you will learn how to
● compute forecast evaluation tests;
● distinguish between and evaluate in-sample and out-of-sample forecasts;
● undertake comparisons of forecasts from alternative models;
● assess the gains from combining forecasts;
● run rolling forecast exercises; and
● calculate sign and direction predictions.

In previous chapters, we focused on diagnostic tests that the real estate analyst can compute to choose between alternative models. Once a model or competing models have been selected, we really want to know how accurately these models forecast. Forecast adequacy tests complement the diagnostic checking that we performed in earlier chapters and can be used as additional criteria to choose between two or more models that have satisfactory diagnostics. In addition, of course, assessing a model's forecast performance is also of interest in itself. Determining the forecasting accuracy of a model is an important test of its adequacy. Some econometricians would go as far as to suggest that the statistical adequacy of a model, in terms of whether it violates the CLRM assumptions or whether it contains insignificant parameters, is largely irrelevant if the model produces accurate forecasts.

This chapter presents commonly used forecast evaluation tests. The literature on forecast accuracy is large and expanding. In this chapter, we draw upon conventional forecast adequacy tests, the application of which generates useful information concerning the forecasting ability of different models.

At the outset we should point out that forecast evaluation can take place with a number of different tests. The choice of which to use depends largely on the objectives of the forecast evaluation exercise. These objectives and tasks to accomplish in the forecast evaluation process are illustrated in this chapter. In addition, we review a number of studies that undertake forecast evaluation so as to illustrate alternative aspects of and approaches to the evaluation process, all of which have practical value.

The computation of the forecast metrics we present below revolves around the forecast errors. We define the forecast error as the actual value minus the forecast value (although, in the literature, the forecast error is sometimes specified as the forecast value minus the actual value). We can categorise four influences that determine the size of the forecast error.

(1) Poor specification on the part of the model.
(2) Structural events: major events that change the nature of the relationship between the variables permanently.
(3) Inaccurate inputs to the model.
(4) Random events: unpredictable circumstances that are short-lived.

The forecast evaluation analysis in this chapter aims to expose poor model specification that is reflected in the forecast error. We neutralise the impact of inaccurate inputs on the forecast error by assuming perfect information about the future values of the inputs. Our analysis is still subject to structural impacts and random events on the forecast error, however. Unfortunately, there is not much that can be done – at least, not quantitatively – when these occur out of the sample.

9.1 Forecast tests

An object of crucial importance in measuring forecast accuracy is the loss function, defined as $L(A_{t+n}, F_{t+n,t})$ or $L(\hat{e}_{t+n,t})$, where $A$ is the realisations (actual values), $F$ is the forecast series, $\hat{e}_{t+n,t}$ is the forecast error $A_{t+n} - F_{t+n,t}$ and $n$ is the forecast horizon. $A_{t+n}$ is the realisation at time $t+n$ and $F_{t+n,t}$ is the forecast for time $t+n$ made at time $t$ ($n$ periods beforehand). The loss function charts the 'loss' or 'cost' associated with the forecasts and realisations (see Diebold and Lopez, 1996). Loss functions differ, as they depend on the situation at hand (see Diebold, 1993). The loss function of the forecast by a government agency will differ from that of a company forecasting the economy or forecasting real estate. A forecaster may be interested in volatility or mean accuracy or the contribution of alternative models to more accurate forecasting. Thus the appropriate accuracy measure arises from the loss function that best describes the utility of the forecast user regarding the forecast error.

In the literature on forecasting, several measures have been proposed to describe the loss function. These measures of forecast quality can be grouped into a number of categories, including forecast bias, sign predictability, forecast accuracy with emphasis on large errors, forecast efficiency and encompassing. The evaluation of the forecast performance on these measures takes place through the computation of the appropriate statistics.

The question frequently arises as to whether there is systematic bias in a forecast. It is obviously a desirable property that the forecast is not biased. The null hypothesis is that the model produces forecasts that lead to errors with a zero mean. A t-test can be calculated to determine whether there is a statistically significant negative or positive bias in the forecasts. For simplicity of exposition, letting the subscript $i$ now denote each observation for which the forecast has been made and the error calculated, the mean error ME or mean forecast error MFE is defined as

$$\mathrm{ME} = \frac{1}{n}\sum_{i=1}^{n} \hat{e}_i \qquad (9.1)$$

where $n$ is the number of periods that the model forecasts.

Another conventional error measure is the mean absolute error MAE, which is the average of the differences between the actual and forecast values in absolute terms, and it is also sometimes termed the mean absolute forecast error MAFE. Thus an error of −2 per cent or +2 per cent will have the same impact on the MAE of 2 per cent. The MAE formula is

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n} |\hat{e}_i| \qquad (9.2)$$

Since both ME and MAE are scale-dependent measures (i.e. they vary with the scale of the variable being forecast), a variant often reported is the mean absolute percentage error MAPE:

$$\mathrm{MAPE} = \frac{100\%}{n}\sum_{i=1}^{n} \left| \frac{A_i - F_i}{A_i} \right| \qquad (9.3)$$

The mean absolute error and the mean absolute percentage error both use absolute values of the forecast errors, which prevent positive and negative errors from cancelling each other out. The above measures are used to assess how closely individual predictions track their corresponding real data figures. In practice, when the series under investigation is already expressed in percentage terms, the MAE criterion is sufficient. Therefore, if we forecast rent growth (expressed as a percentage), MAE is used. If we forecast the actual rent or a rent index, however, MAPE facilitates forecast comparisons.
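The bias measures in equations (9.1) to (9.3) are simple to compute. Below is a minimal sketch in Python with numpy; the function and variable names are ours, chosen for illustration.

```python
import numpy as np

def bias_measures(actual, forecast):
    """ME, MAE and MAPE, with the error defined as actual minus forecast."""
    a = np.asarray(actual, dtype=float)
    f = np.asarray(forecast, dtype=float)
    e = a - f
    me = e.mean()                          # equation (9.1)
    mae = np.abs(e).mean()                 # equation (9.2)
    mape = 100 * np.abs(e / a).mean()      # equation (9.3)
    return me, mae, mape
```

One practical caveat: the MAPE is undefined whenever an actual value is zero, a further reason to avoid it for growth-rate series that hover around zero.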
Another set of tests commonly used in forecast comparisons builds on the variance of the forecast errors. An important statistic from which other metrics are computed is the mean squared error MSE or, equivalently, the mean squared forecast error MSFE:

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n} \hat{e}_i^2 \qquad (9.4)$$

MSE will have units of the square of the data – i.e. of $A_t^2$. In order to produce a statistic that is measured on the same scale as the data, the root mean squared error RMSE is proposed:

$$\mathrm{RMSE} = \sqrt{\mathrm{MSE}} \qquad (9.5)$$

The MSE and RMSE measures have been popular methods to aggregate the deviations of the forecasts from their actual trajectory. The smaller the values of the MSE and RMSE, the more accurate the forecasts. Due to its similar scale with the dependent variable, the RMSE of a forecast can be compared to the standard error of the model. An RMSE higher than, say, twice the standard error does not suggest a good set of forecasts. The RMSE and MSE are useful when comparing different methods applied to the same set of data, but they should not be used when comparing data sets that have different scales (see Chatfield, 1988, and Collopy and Armstrong, 1992).

The MSE and RMSE impose a greater penalty for large errors. The RMSE is a better performance criterion than measures such as MAE and MAPE when the variable of interest undergoes fluctuations and turning points. If the forecast misses these large changes, the RMSE will disproportionately penalise the larger errors. If the variable follows a steadier path, then other measures such as the mean absolute error may be preferred. It follows that the RMSE heavily penalises forecasts with a few large errors relative to forecasts with a large number of small errors. This is important for samples of the small size that we often encounter in real estate. A few large errors will produce higher RMSE and MSE statistics and may lead to the conclusion that the model is less fit for forecasting. Since these measures are sensitive to outliers, some authors (such as Armstrong, 2001) have recommended caution in their use for forecast accuracy evaluation.

Given that the RMSE is scale-dependent, the root mean squared percentage error (RMSPE) can also be used:

$$\mathrm{RMSPE} = 100\% \times \sqrt{\frac{1}{n}\sum_{i=1}^{n} \left( \frac{A_i - F_i}{A_i} \right)^2} \qquad (9.6)$$

As for MAE versus MAPE, if the series we forecast is in percentage terms, the RMSE suffices to illustrate comparisons and use of the RMSPE is unnecessary.

Theil (1966, 1971) utilises the RMSE metric to propose an inequality coefficient that measures the difference between the predicted and actual values in terms of change. An appropriate scalar in the denominator restricts the variations of the coefficient between zero and one:

$$U1 = \frac{\mathrm{RMSE}}{\sqrt{\frac{1}{n}\sum A_i^2} + \sqrt{\frac{1}{n}\sum F_i^2}} \qquad (9.7)$$

Theil's U1 coefficient ranges between zero and one; the closer the computed U1 for the forecast is to zero, the better the prediction.
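Continuing in the same style, the squared-error statistics in equations (9.4) to (9.7) can be sketched as follows (again Python with numpy; names are ours, for illustration only):

```python
import numpy as np

def squared_error_measures(actual, forecast):
    """MSE, RMSE, RMSPE and Theil's U1, per equations (9.4)-(9.7)."""
    a = np.asarray(actual, dtype=float)
    f = np.asarray(forecast, dtype=float)
    e = a - f
    mse = np.mean(e ** 2)                                   # (9.4)
    rmse = np.sqrt(mse)                                     # (9.5)
    rmspe = 100 * np.sqrt(np.mean((e / a) ** 2))            # (9.6)
    # Scaling by the root mean squares of A and F bounds U1 between 0 and 1.
    u1 = rmse / (np.sqrt(np.mean(a ** 2)) + np.sqrt(np.mean(f ** 2)))  # (9.7)
    return mse, rmse, rmspe, u1
```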
The MSE can be decomposed as the sum of three components that collectively explain 100 per cent of its variation: the bias proportion, the variance proportion and the covariance proportion. These components are defined as

$$\text{Bias proportion: } \frac{(\bar{F} - \bar{A})^2}{\mathrm{MSE}} \qquad (9.8)$$

$$\text{Variance proportion: } \frac{(\sigma_F - \sigma_A)^2}{\mathrm{MSE}} \qquad (9.9)$$

$$\text{Covariance proportion: } \frac{2\sigma_F \sigma_A [1 - \rho(F,A)]}{\mathrm{MSE}} \qquad (9.10)$$

where $\bar{F}$ is the mean of the forecast values in the forecast period, $\bar{A}$ is the mean of the actual values in the forecast period, $\sigma$ is the standard deviation and $\rho$ is the correlation coefficient between $A$ and $F$ in the forecast period.

The bias proportion indicates the part of the systematic error in the forecasts that arises from the discrepancy of the average value of the forecast path from the mean of the actual path of the variable. Pindyck and Rubinfeld (1998) argue that a value above 0.1 or 0.2 is troubling. The variance proportion is an indicator of how different the variability of the forecasts is from that of the observed variable over the forecast horizon. Too large a value is also troubling. Finally, the covariance proportion measures the unsystematic error in the forecasts. The larger this component the better, since this would imply that most of the error is due to random events and does not arise from the inability of the model to replicate the mean of the actual series or its variance.

The second metric proposed by Theil, the U2 coefficient, assesses the contribution of the forecast against a naive rule (such as 'no change' – that is, the future values are forecast as the last available observed value) or, more generally, an alternative model:

$$U2 = \left( \frac{\mathrm{MSE}}{\mathrm{MSE}_{\mathrm{NAIVE}}} \right)^{1/2} \qquad (9.11)$$

Theil's U2 coefficient measures the adequacy of the forecast by the quadratic loss criterion. The U2 statistic takes a value of less than one if the model under investigation outperforms the naive one (since the MSE of the naive will be higher than the MSE of the model). If the naive model produces more accurate forecasts, the value of the U2 metric will be higher than one. Of course, the naive approach here does not need to be the 'no change' extrapolation or a random walk; other methods such as an exponential smoothing or an MA model could be used. This criterion can be generalised in order to assess the contributions of an alternative model relative to a base model or an existing model that the forecaster has been using. Again, if U2 is less than one, the model under study (the MSE of which is shown in the numerator) is doing better than the base or existing model.

An alternative statistic to illustrate the gains from using one model instead of an alternative is a measure that is explored by Diebold and Kilian (1997) and Galbraith (2003). This metric is also based on the variance of the forecast error and measures the gain in reducing the value of the MSE from not using the forecasts from a competing model. In essence, this is another way to report results. This statistic is given by

$$C = \frac{\mathrm{MSE}}{\mathrm{MSE}_{\mathrm{ALT}}} - 1 \qquad (9.12)$$

where $C$, the proposed measure, compares the MSE of two forecasts.
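A sketch of the decomposition and of Theil's U2 follows (Python with numpy; note that the population standard deviations, not the $n-1$ variants, are needed for the three proportions to sum exactly to one):

```python
import numpy as np

def mse_decomposition(actual, forecast):
    """Bias, variance and covariance proportions of the MSE, equations (9.8)-(9.10)."""
    a = np.asarray(actual, dtype=float)
    f = np.asarray(forecast, dtype=float)
    mse = np.mean((a - f) ** 2)
    sd_a, sd_f = a.std(), f.std()            # population (ddof=0) standard deviations
    rho = np.corrcoef(f, a)[0, 1]
    bias = (f.mean() - a.mean()) ** 2 / mse              # (9.8)
    variance = (sd_f - sd_a) ** 2 / mse                  # (9.9)
    covariance = 2 * sd_f * sd_a * (1 - rho) / mse       # (9.10)
    return bias, variance, covariance                    # the three sum to one

def theil_u2(actual, forecast, naive):
    """Theil's U2 against a naive or alternative forecast, equation (9.11)."""
    a = np.asarray(actual, dtype=float)
    mse = np.mean((a - np.asarray(forecast, dtype=float)) ** 2)
    mse_naive = np.mean((a - np.asarray(naive, dtype=float)) ** 2)
    return np.sqrt(mse / mse_naive)
```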
Turning to the category of forecast efficiency, the conventional test involves running a regression of the form

$$\hat{e}_i = \alpha + \beta A_i + u_i \qquad (9.13)$$

where $A$ is the series of actual values. Forecast efficiency requires that $\alpha = \beta = 0$ (see Mincer and Zarnowitz, 1969). Equation (9.13) also provides the baseline for rationality. The right-hand side can be augmented with explanatory variables that the forecaster believes the forecasts do not capture. Forecast rationality implies that all coefficients should be zero in any such regression. According to Mincer and Zarnowitz, equation (9.13) can also be used to test for bias: if a forecast is unbiased then $\alpha = 0$.

Tsolacos and McGough (1999) apply similar tests to examine rationality in office construction in the United Kingdom. They test whether their model of UK office construction efficiently incorporates all available information, including that contained in the past values of construction, and whether multi-span forecasts are obtained recursively. It is found that the estimated model incorporates all available information, and that this information is consistently applied to future time periods.

A regression-based test can also be used to examine forecast encompassing – that is, to examine whether the forecasts of a model encompass the forecasts of other models. A formal framework in the case of two competing forecasting models will require the estimation of a model by regressing the realised values on a constant and the two competing series of forecasts. If one forecast set encompasses the other, its regression coefficient will be one, and that of the other zero, with an intercept that also takes a value of zero. Hence the test equation is

$$A_i = \alpha_0 + \alpha_1 F_{1i} + \alpha_2 F_{2i} + u_i \qquad (9.14)$$

where $F_{1i}$ and $F_{2i}$ are the two competing forecasts. If forecast $F_1$ encompasses forecast $F_2$, $\alpha_1$ should be statistically significant and close to one, whereas the coefficient $\alpha_2$ will not be significantly different from zero.
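Both the efficiency and the encompassing tests are plain OLS regressions, so any econometrics package can run them. A sketch using Python's statsmodels (our choice of tooling; the hypotheses are then assessed from the usual t-ratios on the fitted coefficients):

```python
import numpy as np
import statsmodels.api as sm

def efficiency_regression(errors, actual):
    """Equation (9.13): e_hat_i = alpha + beta * A_i + u_i.

    Efficiency (and rationality, in the baseline case) requires
    alpha = beta = 0; unbiasedness requires alpha = 0.
    """
    X = sm.add_constant(np.asarray(actual, dtype=float))
    return sm.OLS(np.asarray(errors, dtype=float), X).fit()

def encompassing_regression(actual, f1, f2):
    """Equation (9.14): A_i = a0 + a1 * F1_i + a2 * F2_i + u_i.

    F1 encompasses F2 if a1 is significant and close to one while
    a0 and a2 are insignificantly different from zero.
    """
    X = sm.add_constant(np.column_stack([f1, f2]))
    return sm.OLS(np.asarray(actual, dtype=float), X).fit()
```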
9.1.1 The difference between in-sample and out-of-sample forecasts
These important concepts are defined and contrasted in box 9.1.

Box 9.1 Comparing in-sample and out-of-sample forecasts
● In-sample forecasts are those generated for the same set of data that was used to estimate the model's parameters. Essentially, in-sample forecasts are the fitted values from a regression model.
● One would expect the 'forecasts' of a model to be relatively good within the sample, for this reason.
● Therefore a sensible approach to model evaluation through an examination of forecast accuracy is not to use all the observations in estimating the model parameters but, rather, to hold some observations back.
● The latter sample, sometimes known as a hold-out sample, would be used to construct out-of-sample forecasts.

9.2 Application of forecast evaluation criteria to a simple regression model

9.2.1 Forecast evaluation for Frankfurt rental growth
Our objective here is to evaluate forecasts from the model we constructed for Frankfurt rent growth in chapter 7 for a period of five years, which is a commonly used horizon in real estate forecasting. It is the practice in empirical work in real estate to evaluate the forecasts at the end of the sample, particularly in markets with small data samples, since it is usually thought that the most recent forecast performance best describes the immediate future performance. Examining forecast adequacy over successive other periods provides a more robust picture of the model's ability to forecast, however. We evaluate the forecast accuracy of model A in table 7.4 in the five-year period 2003 to 2007. We estimate the model until 2002 and we forecast the remaining five years in the sample. Table 9.1 presents the model estimates over the shorter sample period, along with the results we presented in table 7.4 for the whole sample period.

Table 9.1 Regression models for Frankfurt office rents

                                1982–2002                1982–2007
Independent variables      Coefficient   t-ratio    Coefficient   t-ratio
C                             −6.81       −1.8         −6.39       −1.9
VAC_{t−1}                     −3.13       −2.5         −2.19       −2.7
OFSg_t                         4.72        3.2          4.55        3.3
Adjusted R²                    0.53                     0.59
Durbin–Watson statistic        1.94                     1.81

Notes: The dependent variable is RRg, which is real rent growth; VAC is the change in vacancy; OFSg is services output growth in Frankfurt.

We observe that the sensitivity of rent growth to vacancy falls when we include the last five years of the sample. In the last five years rent growth appears to have become more sensitive to OFSg_t. Adding five years of data therefore changes some of the characteristics of the model, which is to some extent a consequence of the small size of the sample in the first place.

For the computation of forecasts, the analyst has two options as to which coefficients to use: the sub-sample coefficients (for the period 1982 to 2002) or those estimated for the whole sample. We would expect coefficients estimated over a longer sample to 'win' over coefficients obtained from shorter samples, as the model is trained with additional and more recent data; the forecasts built on the full-sample coefficients should therefore be more accurate. This does not replicate the real-time forecasting process, however, since we use information that was not available at that time. If we use the full-sample coefficients, we obtain the fitted values we presented in chapter 7 (in-sample forecasts – see box 9.1). The data to calculate the forecasts are given in table 9.2, and table 9.3 demonstrates how to perform the calculations.

Table 9.2 Data and forecasts for rent growth in Frankfurt

                                        Sample for estimation
Year     RRg       VAC     OFSg        1982–2002    1982–2007
2002    −12.37      6.3    0.225
2003    −18.01      5.7    0.056        −26.26       −19.93
2004    −13.30      3.4    0.618        −21.73       −16.06
2005     −3.64      0.1    0.893        −13.24        −9.77
2006     −4.24     −0.2    2.378          4.10         4.21
2007      3.48     −2.3    2.593          6.05         5.85

Note: The forecasts are for the period 2003–7.

Table 9.3 Calculation of forecasts for Frankfurt office rents

        Sample for estimation: 1982–2002                Sample for estimation: 1982–2007
2003    −6.81 − 3.13 × 6.3 + 4.72 × 0.056 = −26.26      −6.39 − 2.19 × 6.3 + 4.55 × 0.056 = −19.93
2004    −6.81 − 3.13 × 5.7 + 4.72 × 0.618 = −21.73      −6.39 − 2.19 × 5.7 + 4.55 × 0.618 = −16.06
...     ...                                             ...
2007    −6.81 − 3.13 × (−0.2) + 4.72 × 2.593 = 6.05     −6.39 − 2.19 × (−0.2) + 4.55 × 2.593 = 5.85

Hence the forecasts from the two models are calculated using the following formulae:

sub-sample coefficients (1982–2002):
$$RRg_{03} = -6.81 - 3.13 \times VAC_{02} + 4.72 \times OFSg_{03} \qquad (9.15)$$

full-sample coefficients (1982–2007):
$$RRg_{03} = -6.39 - 2.19 \times VAC_{02} + 4.55 \times OFSg_{03} \qquad (9.16)$$

For certain years the forecast from the sub-sample is marginally more accurate than the full-sample model's (e.g. in 2006). Overall, however, we would expect the full-sample coefficients to yield more accurate forecasts. A comparison of the forecasts with the actual values confirms this (e.g. in 2003 and 2005). From this comparison, we can obtain an idea of the size of the error, which is fairly large in 2005 and 2006 in particular. We proceed with the calculation of the forecast evaluation tests and undertake a formal assessment of forecast performance.
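The arithmetic in table 9.3 can be reproduced directly from the inputs in tables 9.1 and 9.2. A short Python sketch, with the values copied from those tables:

```python
# VAC enters with a one-year lag, so the 2003 forecast uses the 2002 vacancy change.
years = [2003, 2004, 2005, 2006, 2007]
vac_lag = [6.3, 5.7, 3.4, 0.1, -0.2]          # VAC_{t-1}, from table 9.2
ofsg = [0.056, 0.618, 0.893, 2.378, 2.593]    # OFSg_t, from table 9.2

sub = (-6.81, -3.13, 4.72)    # 1982-2002 coefficients, table 9.1
full = (-6.39, -2.19, 4.55)   # 1982-2007 coefficients, table 9.1

for year, v, g in zip(years, vac_lag, ofsg):
    f_sub = sub[0] + sub[1] * v + sub[2] * g      # equation (9.15)
    f_full = full[0] + full[1] * v + full[2] * g  # equation (9.16)
    print(f"{year}: sub-sample {f_sub:7.2f}   full-sample {f_full:7.2f}")
# 2003 prints -26.26 and -19.93, matching table 9.3.
```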
[...]

... logged and the first differences are taken (denoted by Δl). In equations (9.25) and (9.26), slow adjustments in the market and the weight of previous rents on current rents are allowed through rents lagged by one year (ΔlRENT_{t−1}). The most significant lags of the terms ΔlGDP_{t−i} and ΔlSPEND_{t−i} were selected by minimising the value of the Schwarz information criterion.

[...]

... a series, have been found to be more profitable (Leitch and Tanner, 1991). Two possible indicators of the ability of a model to predict direction changes irrespective of their magnitude are those suggested by Pesaran and Timmermann (1992) and by Refenes (1995). Defining the actual value of the series at time $t+s$ as $A_{t+s}$ and the forecast for that series $s$ steps ...

[...]

... the regression models, the first observations for the one- and two-year real rent and return forecasts are generated by estimating the models up to 1998 and making predictions for 1999 and 2000. The sample then increases by one observation and the regression models are re-estimated to make further one-step- and two-step-ahead forecasts (for 2000 and 2001, respectively). This recursive process continues ...

[...]

... The observation we made of the previous model regarding the coefficients on vacancy and output can also be made in the case of this one. By adding five observations (2003 to 2007), the vacancy coefficient more than halves, suggesting a lower impact on real rent growth. On the other hand, the coefficient on OFSg_t denotes a higher sensitivity. Using the coefficients estimated ...

[...]

... for inflation) and shorter and longer leading indicators. The authors apply stepwise regression, which entails the search for the variables (or terms) from among the above list that maximise the explanatory power of the model. The final specification of this model includes three employment terms (contemporaneous employment and employment at lags 1 and 3), new construction ...
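The recursive (expanding-window) exercise described in the third excerpt above maps onto a simple loop: estimate up to an initial end date, forecast one and two steps ahead, extend the sample by one observation and repeat. A sketch under stated assumptions: annual data in a pandas DataFrame indexed by integer year, a dependent variable column y, and regressor columns whose future values are treated as known, as in the text (all names are illustrative, not the book's).

```python
import pandas as pd
import statsmodels.api as sm

def recursive_forecasts(df: pd.DataFrame, first_end: int = 1998, horizon: int = 2):
    """One- to `horizon`-step-ahead forecasts from an expanding estimation window."""
    x_cols = [c for c in df.columns if c != "y"]
    out = []
    for end in range(first_end, int(df.index.max()) - horizon + 1):
        train = df.loc[:end]                                  # estimate up to `end`
        fit = sm.OLS(train["y"], sm.add_constant(train[x_cols])).fit()
        future = df.loc[end + 1 : end + horizon, x_cols]      # assumed-known inputs
        exog = sm.add_constant(future, has_constant="add")
        out.append(fit.predict(exog))       # forecasts for end+1, ..., end+horizon
    return out
```

The first pass fits on data up to 1998 and predicts 1999 and 2000; each subsequent pass adds one year to the estimation window, mirroring the procedure in the excerpt.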
[...]

... discussion, but, for information, the models are also based on the vacancy and output variables. We apply equation (9.13) to study forecast efficiency for all three forecast models, in this case using a t subscript to denote each observation, since ...

Table 9.8 Data on real rent growth for forecast efficiency and encompassing tests (column headings: forecast values, forecast errors, actual, RM1)

[...]

... forecast of −19.43 (model B for 2005) is obtained as: 0.90 − 1.32 × 18.3 + 4.28 × 0.893 ...

[...]

Table 9.9 Coefficient values from rolling estimations, data and forecasts (a)

[...]

Year     A         F        A−F     |A−F|   (A−F)²     A²       F²      Naive    (A−Naive)²
2003    −18.01    −19.93    1.92    1.92     3.69     324.36   397.20   −12.37     31.81
2004    −13.30    −16.06    2.76    2.76     7.62     176.89   257.92   −12.37      0.88
2005     −3.64     −9.77    6.13    6.13    37.58      13.25    95.45   −12.37     76.21
2006    ...
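To tie the excerpted numbers back to the Frankfurt example: using the actual values and full-sample forecasts from table 9.2, with the naive 'no change' rule carrying the 2002 growth rate of −12.37 forward, the error columns above can be recomputed. A brief sketch in Python:

```python
import numpy as np

actual = np.array([-18.01, -13.30, -3.64, -4.24, 3.48])    # RRg, 2003-7 (table 9.2)
forecast = np.array([-19.93, -16.06, -9.77, 4.21, 5.85])   # full-sample forecasts
naive = np.full_like(actual, -12.37)                       # 'no change' from 2002

e = actual - forecast
print("errors        :", np.round(e, 2))                      # 1.92, 2.76, 6.13, ...
print("squared errors:", np.round(e ** 2, 2))                 # 3.69, 7.62, 37.58, ...
print("naive sq. err.:", np.round((actual - naive) ** 2, 2))  # 31.81, 0.88, 76.21, ...

u2 = np.sqrt(np.mean(e ** 2) / np.mean((actual - naive) ** 2))
print(f"Theil U2 = {u2:.2f}")   # below one, so the model beats the naive rule
```

The first three rows of output reproduce the 2003 to 2005 entries of the excerpted table exactly.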