Engineering Statistics Handbook, Episode 8, Part 12

6. Process or Product Monitoring and Control
6.4. Introduction to Time Series Analysis
6.4.4. Univariate Time Series Models

6.4.4.3. Seasonality
http://www.itl.nist.gov/div898/handbook/pmc/section4/pmc443.htm

Seasonality — Many time series display seasonality. By seasonality, we mean periodic fluctuations. For example, retail sales tend to peak for the Christmas season and then decline after the holidays. So time series of retail sales will typically show increasing sales from September through December and declining sales in January and February.

Seasonality is quite common in economic time series. It is less common in engineering and scientific data. If seasonality is present, it must be incorporated into the time series model. In this section, we discuss techniques for detecting seasonality. We defer modeling of seasonality until later sections.

Detecting Seasonality — The following graphical techniques can be used to detect seasonality:
1. A run sequence plot will often show seasonality.
2. A seasonal subseries plot is a specialized technique for showing seasonality.
3. Multiple box plots can be used as an alternative to the seasonal subseries plot to detect seasonality.
4. The autocorrelation plot can help identify seasonality.

Examples of each of these plots will be shown below. The run sequence plot is a recommended first step for analyzing any time series. Although seasonality can sometimes be indicated with this plot, seasonality is shown more clearly by the seasonal subseries plot or the box plot. The seasonal subseries plot does an excellent job of showing both the seasonal differences (between-group patterns) and also the within-group patterns. The box plot shows the seasonal differences (between-group patterns) quite well, but it does not show within-group patterns. However, for large data sets, the box plot is usually easier to read than the seasonal subseries plot. Both the seasonal subseries plot and the box plot assume that the seasonal periods are known.
In most cases, the analyst will in fact know this. For example, for monthly data, the period is 12 since there are 12 months in a year. However, if the period is not known, the autocorrelation plot can help. If there is significant seasonality, the autocorrelation plot should show spikes at lags equal to the period. For example, for monthly data, if there is a seasonality effect, we would expect to see significant peaks at lags 12, 24, 36, and so on (although the intensity may decrease the further out we go).

Example without Seasonality — The following plots are from a data set of southern oscillations for predicting El Niño.

Run Sequence Plot — No obvious periodic patterns are apparent in the run sequence plot.

Seasonal Subseries Plot — The means for each month are relatively close and show no obvious pattern.

Box Plot — As with the seasonal subseries plot, no obvious seasonal pattern is apparent. Due to the rather large number of observations, the box plot shows the difference between months better than the seasonal subseries plot.

Example with Seasonality — The following plots are from a data set of monthly CO2 concentrations. A linear trend has been removed from these data.

Run Sequence Plot — This plot shows periodic behavior. However, it is difficult to determine the nature of the seasonality from this plot.

Seasonal Subseries Plot — The seasonal subseries plot shows the seasonal pattern more clearly. In this case, the CO2 concentrations are at a minimum in September and October.
From there, the concentrations increase steadily until June and then begin declining until September.

Box Plot — As with the seasonal subseries plot, the seasonal pattern is quite evident in the box plot.

6.4.4.3.1. Seasonal Subseries Plot
http://www.itl.nist.gov/div898/handbook/pmc/section4/pmc4431.htm

This plot allows you to detect both between-group and within-group patterns. If there is a large number of observations, then a box plot may be preferable.

Definition — Seasonal subseries plots are formed by:
Vertical axis: response variable.
Horizontal axis: time ordered by season. For example, with monthly data, all the January values are plotted (in chronological order), then all the February values, and so on.
In addition, a reference line is drawn at the group means. The user must specify the length of the seasonal pattern before generating this plot. In most cases, the analyst will know this from the context of the problem and data collection.

Questions — The seasonal subseries plot can provide answers to the following questions:
1. Do the data exhibit a seasonal pattern?
2. What is the nature of the seasonality?
3. Is there a within-group pattern (e.g., do January and July exhibit similar patterns)?
4. Are there any outliers once seasonality has been accounted for?

Importance — It is important to know when analyzing a time series if there is a significant seasonality effect. The seasonal subseries plot is an excellent tool for determining if there is a seasonal pattern.

Related Techniques — Box plot, run sequence plot, autocorrelation plot.

Software — Seasonal subseries plots are available in a few general purpose statistical software programs. They are available in Dataplot. It may be possible to write macros to generate this plot in most statistical software programs that do not provide it directly.
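The construction in the Definition above can be sketched in a few lines of code. This is an illustrative sketch only (not Dataplot code); NumPy and the helper name `seasonal_subseries` are assumptions for the example.

```python
import numpy as np

def seasonal_subseries(x, period):
    # Group the series by season: all values for season 0 (e.g. January)
    # in chronological order, then season 1, and so on, as described above.
    x = np.asarray(x, dtype=float)
    groups = [x[s::period] for s in range(period)]
    means = np.array([g.mean() for g in groups])   # reference line per group
    reordered = np.concatenate(groups)             # the plotting order
    return reordered, means

# Two "years" of monthly data (24 points) as a toy input.
x = np.arange(24, dtype=float)
reordered, means = seasonal_subseries(x, 12)
print(reordered[:4])   # January values (x[0], x[12]), then February values
print(means[:2])       # group means used for the reference lines
```

Plotting `reordered` against its index, with a horizontal segment at each group mean, reproduces the seasonal subseries plot described above.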
6.4.4.4. Common Approaches to Univariate Time Series
http://www.itl.nist.gov/div898/handbook/pmc/section4/pmc444.htm

Autoregressive (AR) Models — A common approach for modeling univariate time series is the autoregressive (AR) model:

    X_t = δ + φ_1 X_{t-1} + φ_2 X_{t-2} + ... + φ_p X_{t-p} + A_t,

where X_t is the time series, A_t is white noise, and δ = (1 − φ_1 − ... − φ_p) μ, with μ denoting the process mean. An autoregressive model is simply a linear regression of the current value of the series against one or more prior values of the series. The value of p is called the order of the AR model. AR models can be analyzed with one of various methods, including standard linear least squares techniques. They also have a straightforward interpretation.

Moving Average (MA) Models — Another common approach for modeling univariate time series is the moving average (MA) model:

    X_t = μ + A_t − θ_1 A_{t-1} − θ_2 A_{t-2} − ... − θ_q A_{t-q},

where X_t is the time series, μ is the mean of the series, the A_{t-i} are white noise terms, and θ_1, ..., θ_q are the parameters of the model. The value of q is called the order of the MA model. That is, a moving average model is conceptually a linear regression of the current value of the series against the white noise or random shocks of one or more prior values of the series. The random shocks at each point are assumed to come from the same distribution, typically a normal distribution, with location at zero and constant scale. The distinction in this model is that these random shocks are propagated to future values of the time series. Fitting the MA estimates is more complicated than with AR models because the error terms are not observable. This means that iterative non-linear fitting procedures need to be used in place of linear least squares. MA models also have a less obvious interpretation than AR models. Sometimes the ACF and PACF will suggest that an MA model would be a better model choice, and sometimes both AR and MA terms should be used in the same model (see Section 6.4.4.5).
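As noted above, AR models can be fit with standard linear least squares: the current value is regressed on prior values. The following sketch (an illustration with simulated data, not from the Handbook; NumPy is assumed) generates an AR(1) series and recovers φ by regressing X_t on X_{t-1}.

```python
import numpy as np

# Simulate an AR(1) process: X_t = delta + phi * X_{t-1} + A_t.
rng = np.random.default_rng(42)
phi_true, delta = 0.7, 1.0
n = 5000
x = np.empty(n)
x[0] = delta / (1 - phi_true)        # start at the process mean
for t in range(1, n):
    x[t] = delta + phi_true * x[t - 1] + rng.normal()

# Ordinary least squares: regress x[t] on a constant and x[t-1].
X = np.column_stack([np.ones(n - 1), x[:-1]])
b0, b1 = np.linalg.lstsq(X, x[1:], rcond=None)[0]
print(round(b1, 2))   # estimate of phi, close to 0.7
print(round(b0, 2))   # estimate of delta, close to 1.0
```

This is exactly the "linear regression against prior values" interpretation; for an AR(p) model the design matrix would simply gain p lagged columns.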
Note, however, that the error terms after the model is fit should be independent and follow the standard assumptions for a univariate process.

Box-Jenkins Approach — Box and Jenkins popularized an approach that combines the moving average and the autoregressive approaches in the book "Time Series Analysis: Forecasting and Control" (Box, Jenkins, and Reinsel, 1994). Although both autoregressive and moving average approaches were already known (and were originally investigated by Yule), the contribution of Box and Jenkins was in developing a systematic methodology for identifying and estimating models that could incorporate both approaches. This makes Box-Jenkins models a powerful class of models. The next several sections will discuss these models in detail.

6.4.4.5. Box-Jenkins Models
http://www.itl.nist.gov/div898/handbook/pmc/section4/pmc445.htm

Stages in Box-Jenkins Modeling — There are three primary stages in building a Box-Jenkins time series model:
1. Model identification
2. Model estimation
3. Model validation

Remarks — The following remarks regarding Box-Jenkins models should be noted:
1. Box-Jenkins models are quite flexible due to the inclusion of both autoregressive and moving average terms.
2. Based on the Wold decomposition theorem (not discussed in the Handbook), a stationary process can be approximated by an ARMA model. In practice, finding that approximation may not be easy.
3. Chatfield (1996) recommends decomposition methods for series in which the trend and seasonal components are dominant.
4. Building good ARIMA models generally requires more experience than commonly used statistical methods such as regression.

Sufficiently Long Series Required — Typically, effective fitting of Box-Jenkins models requires at least a moderately long series.
Chatfield (1996) recommends at least 50 observations. Many others would recommend at least 100 observations.

6.4.4.6. Box-Jenkins Model Identification
http://www.itl.nist.gov/div898/handbook/pmc/section4/pmc446.htm

Identify p and q — Once stationarity and seasonality have been addressed, the next step is to identify the order (i.e., the p and q) of the autoregressive and moving average terms.

Autocorrelation and Partial Autocorrelation Plots — The primary tools for doing this are the autocorrelation plot and the partial autocorrelation plot. The sample autocorrelation plot and the sample partial autocorrelation plot are compared to the theoretical behavior of these plots when the order is known.

Order of Autoregressive Process (p) — Specifically, for an AR(1) process, the sample autocorrelation function should have an exponentially decreasing appearance. However, higher-order AR processes are often a mixture of exponentially decreasing and damped sinusoidal components. For higher-order autoregressive processes, the sample autocorrelation needs to be supplemented with a partial autocorrelation plot. The partial autocorrelation of an AR(p) process becomes zero at lag p+1 and greater, so we examine the sample partial autocorrelation function to see if there is evidence of a departure from zero. This is usually determined by placing a 95% confidence interval on the sample partial autocorrelation plot (most software programs that generate sample autocorrelation plots will also plot this confidence interval). If the software program does not generate the confidence band, it is approximately ±2/√N, with N denoting the sample size.

Order of Moving Average Process (q) — The autocorrelation function of an MA(q) process becomes zero at lag q+1 and greater, so we examine the sample autocorrelation function to see where it essentially becomes zero. We do this by placing the 95% confidence interval for the sample autocorrelation function on the sample autocorrelation plot.
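The ±2/√N band is easy to compute directly. As a rough sketch (not from the Handbook; NumPy and the helper name `acf` are assumptions), the following computes the sample autocorrelations of a white-noise series and counts how many fall outside the band; for pure noise, only occasional spurious exceedances are expected.

```python
import numpy as np

def acf(x, max_lag):
    # Sample autocorrelations at lags 1..max_lag (standard biased form).
    x = np.asarray(x, dtype=float) - np.mean(x)
    denom = np.dot(x, x)
    return np.array([np.dot(x[:-k], x[k:]) / denom
                     for k in range(1, max_lag + 1)])

rng = np.random.default_rng(7)
n = 400
band = 2 / np.sqrt(n)        # approximate 95% limits: +/- 0.1 here
white = rng.normal(size=n)
r = acf(white, 20)

print(round(band, 3))
print(int(np.sum(np.abs(r) > band)))   # few exceedances expected for noise
```

Spikes well outside this band at low lags suggest AR or MA terms; a spike at the seasonal lag (e.g., 12 for monthly data) suggests seasonality, as discussed in Section 6.4.4.3.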
Most software that can generate the autocorrelation plot can also generate this confidence interval. The sample partial autocorrelation function is generally not helpful for identifying the order of the moving average process. For additional information on these techniques, see Brockwell and Davis (1987, 2002).

Examples — We show a typical series of plots for performing the initial model identification for:
1. the southern oscillations data, and
2. the CO2 monthly concentrations data.

6.4.4.6.1. Model Identification for Southern Oscillations Data
http://www.itl.nist.gov/div898/handbook/pmc/section4/pmc4461.htm

Since the series does not exhibit any significant non-stationarity or seasonality, we generate the autocorrelation and partial autocorrelation plots of the raw data.

Autocorrelation Plot — The autocorrelation plot shows a mixture of exponentially decaying and damped sinusoidal components. This indicates that an autoregressive model, with order greater than one, may be appropriate for these data. The partial autocorrelation plot should be examined to determine the order.

Partial Autocorrelation Plot — The partial autocorrelation plot suggests that an AR(2) model might be appropriate. In summary, our initial attempt would be to fit an AR(2) model with no seasonal terms and no differencing or trend removal. Model validation should be performed before accepting this as a final model.
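The AR(2) identification above relies on the sample PACF cutting off after lag 2. A small simulation can illustrate this behavior (a sketch, not Handbook code; NumPy is assumed, and the sample PACF is computed here with the Durbin-Levinson recursion).

```python
import numpy as np

def sample_acf(x, max_lag):
    # Sample autocorrelations r[0..max_lag], with r[0] = 1.
    x = np.asarray(x, dtype=float) - np.mean(x)
    d = np.dot(x, x)
    return np.array([1.0] + [np.dot(x[:-k], x[k:]) / d
                             for k in range(1, max_lag + 1)])

def sample_pacf(x, max_lag):
    # Durbin-Levinson recursion on the sample autocorrelations.
    r = sample_acf(x, max_lag)
    phi = np.zeros(max_lag + 1)
    pac = np.zeros(max_lag)
    phi[1] = pac[0] = r[1]
    for k in range(2, max_lag + 1):
        a = (r[k] - np.dot(phi[1:k], r[k - 1:0:-1])) / \
            (1.0 - np.dot(phi[1:k], r[1:k]))
        phi[1:k] = phi[1:k] - a * phi[k - 1:0:-1]
        phi[k] = a
        pac[k - 1] = a
    return pac

# Simulate an AR(2) process: X_t = 0.6 X_{t-1} + 0.2 X_{t-2} + A_t.
rng = np.random.default_rng(3)
n = 2000
x = np.zeros(n)
for t in range(2, n):
    x[t] = 0.6 * x[t - 1] + 0.2 * x[t - 2] + rng.normal()

pac = sample_pacf(x, 5)
print(pac.round(2))   # clearly nonzero at lags 1-2, near zero from lag 3 on
```

The first two partial autocorrelations stand well outside the ±2/√N band, while lags 3 and beyond fall inside it, which is exactly the pattern that points to an AR(2) model.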
Shape of Autocorrelation Function — The following table summarizes how the sample autocorrelation function is used for model identification.

SHAPE — INDICATED MODEL
Exponential, decaying to zero — Autoregressive model. Use the partial autocorrelation plot to identify the order of the autoregressive model.
Alternating positive and negative, decaying to zero — Autoregressive model. Use the partial autocorrelation plot to help identify the order.
One or more spikes, rest are essentially zero — Moving average model; the order is identified by where the plot becomes essentially zero.
No decay to zero — Series is not stationary.

Mixed Models Difficult to Identify — In practice, the sample autocorrelation and partial autocorrelation functions are random variables and will not give the same picture as the theoretical functions. This makes model identification more difficult. In particular, mixed models can be particularly difficult to identify. Although experience is helpful, developing good models using these sample plots can involve much trial and error.
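The theoretical shapes summarized in the table above can be written down directly for the two simplest cases. As a sketch (standard textbook formulas, not Handbook code; NumPy assumed): an AR(1) process with parameter φ has ρ(k) = φ^k, an exponential decay, while an MA(1) process with parameter θ has ρ(1) = −θ/(1 + θ²) and ρ(k) = 0 for k ≥ 2, a sharp cutoff.

```python
import numpy as np

phi, theta = 0.6, 0.5
lags = np.arange(1, 6)

# AR(1): rho(k) = phi**k decays exponentially toward zero.
ar1_acf = phi ** lags

# MA(1): a single nonzero value at lag 1, exactly zero afterwards.
ma1_acf = np.where(lags == 1, -theta / (1 + theta**2), 0.0)

print(ar1_acf.round(3))   # geometric decay: 0.6, 0.36, 0.216, ...
print(ma1_acf.round(3))   # -0.4 at lag 1, then exact zeros
```

The contrast between gradual decay and an abrupt cutoff is what distinguishes the AR and MA rows of the table; mixed ARMA processes show decay beginning after the first few lags, which is why they are harder to identify from these plots alone.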
