© 2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.

Business Analytics: Data Analysis and Decision Making
Chapter 12: Time Series Analysis and Forecasting

Introduction
- Forecasting is a very difficult task, both in the short run and in the long run.
- Analysts search for patterns or relationships in historical data and then make forecasts.
- There are two problems with this approach:
  - It is not always easy to uncover historical patterns or relationships.
    - It is often difficult to separate the noise, or random behavior, from the underlying patterns.
    - Some forecasts may attribute importance to patterns that are in fact random variations and are unlikely to repeat themselves.
  - There are no guarantees that past patterns will continue in the future.

Forecasting Methods: An Overview
- There are many forecasting methods available, and there is little agreement as to the best forecasting method.
- The methods can be divided into three groups:
  - Judgmental methods
  - Extrapolation (or time series) methods
  - Econometric (or causal) methods
- The first group is basically nonquantitative; the last two are quantitative.

Extrapolation Models
- Extrapolation models are quantitative models that use past data of a time series variable to forecast future values of the variable.
- Many extrapolation models are available:
  - Trend-based regression
  - Autoregression
  - Moving averages
  - Exponential smoothing
- All of these methods look for patterns in the historical series and then extrapolate these patterns into the future.
- Complex models are not always better than simpler models.
- Simpler models track only the most basic underlying patterns
  and can be more flexible and accurate in forecasting the future.

Econometric Models
- Econometric models, also called causal or regression-based models, use regression to forecast a time series variable by using other explanatory time series variables.
- Prediction from the regression equation, in its general form with explanatory variables $X_1, \ldots, X_k$:
  $\hat{Y}_t = a + b_1 X_{1t} + b_2 X_{2t} + \cdots + b_k X_{kt}$
- Causal regression models present mathematical challenges, including:
  - Determining the appropriate "lags" for the regression equation
  - Deciding whether to include lags of the dependent variable as explanatory variables
  - Autocorrelation (correlation of a variable with itself) and cross-correlation (correlation of a variable with a lagged version of another variable)

Combining Forecasts
- This method combines two or more forecasts to obtain the final forecast.
- The reasoning is simple: the forecast errors from different forecasting methods might cancel one another.
- Forecasts that are combined can be of the same general type or of different types.
- The number of forecasts to combine and the weights to use in combining them have been the subject of several research studies.

Components of Time Series Data
- Trend: if observations increase or decrease regularly through time, the time series has a trend.
  - Linear trend: occurs if the observations increase by the same amount from period to period.
  - Exponential trend: occurs when observations increase at a rapidly growing rate (roughly the same percentage, rather than the same amount, from period to period).
  - S-shaped trend: occurs when it takes a while for observations to start increasing, but then a rapid increase occurs, before finally tapering off to a fairly constant level.
- Seasonal component: if a time series has a seasonal component, it exhibits seasonality; that is, the same seasonal pattern tends to repeat itself every year.
- Cyclic component: a time series has a cyclic component when business cycles affect the variables in similar ways.
  - The cyclic component is more difficult to predict than the seasonal component, because seasonal variation is much more regular.
  - The length of the business cycle varies, sometimes substantially.
  - The length of a seasonal cycle is generally one year, while the length of a business cycle is generally longer than one year and its actual length is difficult to predict.
- Random variation (or noise): the unpredictable component that gives most time series graphs their irregular, zigzag appearance.
  - A time series can be determined only to a certain extent by its trend, seasonal, and cyclic components; other factors determine the rest.
  - These other factors combine to create a certain amount of unpredictability in almost all time series.

Example 12.5 (Continued): House Sales.xlsx
- Objective: To see how well a simple exponential smoothing model, with an appropriate smoothing constant, fits the housing sales data, and to see how StatTools implements this method.
- Solution: Select Forecast from the StatTools Time Series and Forecasting dropdown list. Then
  select the simple exponential smoothing option in the Forecast Settings tab, and choose a smoothing constant. The results are shown to the right.
- The graph below shows the forecast series superimposed on the original series.

Holt's Model for Trend
- When there is a trend in the series, Holt's method deals with it explicitly by including a trend term, $T_t$, and a corresponding smoothing constant $\beta$.
- The interpretation of $L_t$ is exactly as before.
- The interpretation of $T_t$ is that it represents an estimate of the change in the series from one period to the next.
- The equations for Holt's model, in their standard form, are shown below. Here $Y_t$ is the observation in period $t$, $\alpha$ is the smoothing constant for the level, and $F_{t+k}$ is the forecast made in period $t$ for $k$ periods ahead:
  $L_t = \alpha Y_t + (1 - \alpha)(L_{t-1} + T_{t-1})$
  $T_t = \beta (L_t - L_{t-1}) + (1 - \beta) T_{t-1}$
  $F_{t+k} = L_t + k T_t$

Example 12.5 (Continued): House Sales.xlsx
- Objective: To see whether Holt's method, with appropriate smoothing constants, captures the trends in the housing sales data better than simple exponential smoothing (or moving averages).
- Solution: Implement Holt's method in StatTools almost exactly as for simple exponential smoothing. The only difference is that you now choose two smoothing constants.
- The output is very similar to the simple exponential smoothing output, except that there is now a trend column.
- Now perform a second run of Holt's method, using the Optimize Parameters option.
- The forecasts with nonoptimal smoothing constants are shown below, on the left. The
  forecasts with optimal smoothing constants are shown below, on the right.

Seasonal Models
- Seasonality is the consistent month-to-month (or quarter-to-quarter) pattern of differences that occurs each year.
- The easiest way to check for seasonality is graphically: look for a regular pattern of ups and/or downs in particular months or quarters.
- There are three basic methods for dealing with seasonality:
  - Winters' exponential smoothing model
  - Deseasonalizing the data (then using any forecasting method to model the deseasonalized data and finally "reseasonalizing" these forecasts)
  - Multiple regression with dummy variables for the seasons
- Seasonal models are classified as additive or multiplicative.
  - In an additive seasonal model, an appropriate seasonal index is added to a base forecast. The indexes, one for each season, typically average to 0.
  - In a multiplicative seasonal model, a base forecast is multiplied by an appropriate seasonal index. These indexes, one for each season, typically average to 1.

Winters' Exponential Smoothing Model
- Winters' exponential smoothing model is very similar to Holt's model, but it also has seasonal indexes and a corresponding smoothing constant $\gamma$.
- This new smoothing constant controls how quickly the method reacts to observed changes in the seasonality pattern.
  - If the constant is small, the method reacts slowly.
  - If it is large, the method reacts more quickly.
- The equations for this method, in their standard multiplicative form, are shown below. Here $M$ is the number of seasons in a year (12 for monthly data, 4 for quarterly data) and $S_t$ is the seasonal index for period $t$:
  $L_t = \alpha \dfrac{Y_t}{S_{t-M}} + (1 - \alpha)(L_{t-1} + T_{t-1})$
  $T_t = \beta (L_t - L_{t-1}) + (1 - \beta) T_{t-1}$
  $S_t = \gamma \dfrac{Y_t}{L_t} + (1 - \gamma) S_{t-M}$
  $F_{t+k} = (L_t + k T_t)\, S_{t+k-M}$

Example 12.6: Soft Drink Sales.xlsx
- Objective: To see how well Winters' method, with appropriate smoothing constants,
  can forecast the company's seasonal soft drink sales.
- Solution: The data file contains quarterly sales for a large soft drink company from 1997 through Q4-2012.
  - There has been an upward trend in sales during this period, and there is also a fairly regular seasonal pattern: sales in the warmer quarters are consistently higher than in the colder quarters.
  - Proceed in StatTools exactly as with the other exponential smoothing methods, but hold out some of the data for validation.
  - Fill in the Forecast Settings tab, selecting Winters' method, basing the model on the data through Q4-2010, holding out eight quarters of data (Q1-2011 through Q4-2012), and forecasting four quarters into the future (all of 2013).
- Parts of the output are shown below, on the left. The plot of the forecasts superimposed on the original series is shown below, on the right.

Deseasonalizing: The Ratio-to-Moving-Averages Method
- Most methods for deseasonalizing time series data are variations of the ratio-to-moving-averages method.
- To deseasonalize an observation (assuming a multiplicative model of seasonality), divide it by the appropriate seasonal index.
- To find the seasonal index for a particular month, divide the month's observation by the average of the 12 observations surrounding it.
- There is a minor problem with this approach: no month sits exactly in the middle of a 12-month sequence.
- Compromise by averaging the two possible averages. (For June, this would be the January-to-December average and the average from the preceding December through November.)
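The centered-average compromise just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how StatTools implements the method: the function names `centered_moving_average`, `seasonal_indexes`, and `deseasonalize` are hypothetical, the period is assumed to be even (12 for monthly data), the first observation is assumed to fall in the first season, and the series is assumed to cover at least two full years. The sketch also averages each month's ratios across years to obtain one index per month.

```python
from statistics import mean


def centered_moving_average(series, period=12):
    """Centered average for each position; None where a full window is unavailable.

    Assumes `period` is even (12 for monthly, 4 for quarterly data).
    """
    half = period // 2
    centered = [None] * len(series)
    for t in range(half, len(series) - half):
        # Window that starts `half` periods before t
        # (for June: the preceding December through November).
        early = mean(series[t - half : t + half])
        # Window shifted one period later (for June: January through December).
        late = mean(series[t - half + 1 : t + half + 1])
        centered[t] = (early + late) / 2
    return centered


def seasonal_indexes(series, period=12):
    """Average the observation-to-centered-average ratios season by season.

    Assumes series[0] falls in the first season and that the series is long
    enough (at least two full years) for every season to get a ratio.
    """
    centered = centered_moving_average(series, period)
    ratios = [[] for _ in range(period)]
    for t, (y, c) in enumerate(zip(series, centered)):
        if c is not None:
            ratios[t % period].append(y / c)
    return [mean(r) for r in ratios]


def deseasonalize(series, indexes):
    """Divide each observation by the index for its season (multiplicative model)."""
    period = len(indexes)
    return [y / indexes[t % period] for t, y in enumerate(series)]
```

For a series with a constant level and a fixed multiplicative seasonal pattern, `seasonal_indexes` should recover that pattern and `deseasonalize` should return an approximately flat series. With real data containing trend and noise, the indexes are only estimates.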
- This is called a centered average.
- The usual way to combine all of the indexes for a specific month (if the series covers several years) is to average them.

Example 12.6 (Continued): Soft Drink Sales.xlsx
- Objective: To use the ratio-to-moving-averages method to deseasonalize the soft drink data and then forecast the deseasonalized data.
- Solution: In StatTools, proceed as with the other exponential smoothing methods, but check the Deseasonalize option in the Time Scale tab of the Forecast dialog box.
- Selected outputs are shown below.
- The deseasonalized data, with forecasts superimposed, are shown below, on the left. The results of reseasonalizing are shown below, on the right.

Estimating Seasonality with Regression
- A regression approach to forecasting seasonal data uses dummy variables for the seasons.
- Depending on how the regression equation is written, you can create either an additive or a multiplicative seasonal model.
- For example, for quarterly data, create three dummy variables for the first three quarters (using quarter 4 as the reference quarter) and estimate the additive equation:
  $\hat{Y}_t = a + b\,t + b_1 Q_1 + b_2 Q_2 + b_3 Q_3$
- Then the coefficients of the dummy variables, $b_1$, $b_2$, and $b_3$, indicate how much each quarter differs from the reference quarter, and the coefficient of time, $b$, represents the trend.
- It is also possible to estimate a multiplicative model using dummy variables for seasonality (and possibly time for trend).
- An advantage of this approach is that it provides a model with multiplicative
  seasonal factors and is fairly easy to interpret.

Example 12.6 (Continued): Soft Drink Sales.xlsx
- Objective: To use a multiplicative regression equation, with dummy variables for seasons and a time variable for trend, to forecast soft drink sales.
- Solution: The data setup is shown below, with dummy variables for three of the four quarters and a Log(Sales) variable.
- Then use multiple regression, with Log(Sales) as the dependent variable, and Time, Q1, Q2, and Q3 as the explanatory variables.
- The regression output is shown on the top right. A plot of observations versus forecasts for this model is shown on the bottom right.

... residuals visually, although this is not always reliable ...

... related to their own past values. In positive autocorrelation, large observations tend to follow large observations, and small observations tend to follow small observations. The autocorrelation