Chapter 28: The TIMEID Procedure (Experimental)

Output 28.1.6 Time ID Offsets Histogram

The span diagnostics in Output 28.1.7 and Output 28.1.8 show the distribution of the span sizes between successive DATE values. The TriWeek data set has three different span sizes, of widths 0, 1, and 2. Here one span corresponds to the width of a WEEK3 interval.

Output 28.1.7 Time ID Span Listings

   The TIMEID Procedure

   Component
   Value Index    Span    Frequency    Percentage
             1       0            1      0.704225
             2       1          135     95.070423
             3       2            6      4.225352

Example 28.1: Examining a Weekly Time ID Variable

Output 28.1.7 continued

   Statistics Summary

                                       Standard
   Minimum    Maximum         Mean    Deviation
         0          2    1.0352113    0.6367974

Output 28.1.8 Time ID Span Histogram

Output 28.1.9 and Output 28.1.10 show the distribution of time ID values before alignment to the WEEK3 interval. The listing in Output 28.1.9 has been truncated to include only the first 10 observations.

Output 28.1.9 Unaligned Time ID Listings

   Time ID Values for DATE

   Value
   Index    date                Frequency    Percentage
       1    Tue, 28 Dec 1948            1      0.694444
       2    Tue, 18 Jan 1949            1      0.694444
       3    Tue, 8 Feb 1949             1      0.694444
       4    Tue, 1 Mar 1949             1      0.694444
       5    Tue, 22 Mar 1949            1      0.694444
       6    Tue, 12 Apr 1949            1      0.694444
       7    Tue, 3 May 1949             1      0.694444
       8    Tue, 24 May 1949            1      0.694444
       9    Fri, 17 Jun 1949            1      0.694444
      10    Tue, 5 Jul 1949             1      0.694444

Output 28.1.10 Unaligned Time ID Histogram

Example 28.2: Inferring a Date Interval

This example illustrates how the interval of a time ID variable can be inferred from a data set when a sufficient number of observations are present.

   data workdays;
      format day weekdate.;
      input day : date. @@;
      datalines;
   01AUG09 06AUG09 11AUG09 14AUG09 19AUG09 22AUG09
   27AUG09 01SEP09 04SEP09 09SEP09 12SEP09 17SEP09
   ;

   proc timeid data=workdays print=interval;
      id day;
   run;

The 12 observations in the WorkDays data set are enough to determine that the DAY time ID variable is represented by the WEEKDAY12W3 interval.
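One informal way to cross-check the inferred interval is to count working days between successive DAY values with the INTCK function. The sketch below is not part of the original example: it assumes the WorkDays data set created above already exists, and the names CHECK, PREV, and STEPS are hypothetical. If the inference holds, each nonmissing STEPS value should equal the multiplier, 3.

```sas
/* Sketch only: CHECK, PREV, and STEPS are hypothetical names.        */
/* Assumes the WorkDays data set from this example has been created.  */
data check;
   set workdays;
   prev = lag(day);              /* previous DAY value (missing on row 1) */
   /* Count WEEKDAY12W boundaries, that is, working days when Sunday   */
   /* and Monday are treated as weekend days.                          */
   if _n_ > 1 then steps = intck('weekday12w', prev, day);
   format prev weekdate.;
run;

/* Tabulate the step sizes between successive dates. */
proc freq data=check;
   tables steps;
run;
```

A single STEPS value across all rows would be consistent with a fixed multiple of the WEEKDAY12W base interval; mixed values would suggest gaps or duplicates.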
The WEEKDAY12W3 interval corresponds to every third day of the week, excluding Sundays and Mondays. Characteristics of this interval are shown in Output 28.2.1.

Output 28.2.1 Inferred Time Interval Information

   The TIMEID Procedure

   Time Interval Analysis Summary

   Time ID Variable            day
   Time Interval               WEEKDAY12W3
   Base Name                   WEEKDAY
   Multiplier                  3
   Shift                       0
   Length of Seasonal Cycle    5
   Time ID Format              WEEKDATE
   Start                       Saturday, August 1, 2009
   End                         Thursday, September 17, 2009

Example 28.3: Examining Multiple BY Groups

This example illustrates how a time ID variable can be examined independently over each BY group and summarized over all observations in the DATA= data set.

   data bygroups;
      format tid date.;
      input tid : date. by @@;
      datalines;
   more lines

The following TIMEID procedure statements generate two data sets that summarize a data set with four BY groups.

   proc timeid data=bygroups
               outintervaldetails=int
               outinterval=intsum;
      id tid;
      by by;
   run;

The summarized information in Output 28.3.1 shows that BY groups 2, 3, and 4 in the ByGroups data set contain some duplicate values and spans, and that group 1 conforms exactly to the WEEKDAY17W interval. The listing also shows that the date ranges of the four BY groups start and end on different days and that they overlap between December 7, 2009, and December 28, 2009.

Output 28.3.1 Selected Variables in the Combined OUTINTERVALDETAILS= and OUTINTERVAL= Data Sets

   by     N   NINTCNTS   PCTINTCNTS   NOFFSETS   PCTOFFSETS   NSPANS   PCTSPANS   STATUS   INTERVAL     START
    1    25          1         0.00          1            0        1    0.00000        0   WEEKDAY17W   24NOV09
    2    25          2         0.08          1            0        2    0.00000        0   WEEKDAY17W   27NOV09
    3    25          2         0.16          1            0        2    0.04348        0   WEEKDAY17W   02DEC09
    4    25          2         0.24          1            0        2    0.13043        0   WEEKDAY17W   07DEC09
    .   100          .            .          .            .        .          .        0   WEEKDAY17W   24NOV09

   by   END       SEASONALITY   NSEASONCYCLES   STARTSHARED   ENDSHARED   NBY   TOTALSEASONCYCLES   SEASONCYCLESSHARED
    1   28DEC09             5               5             .           .     .                   .                    .
    2   31DEC09             5               5             .           .     .                   .                    .
    3   05JAN10             5               5             .           .     .                   .                    .
    4   08JAN10             5               4             .           .     .                   .                    .
    .   08JAN10             5               .       07DEC09     28DEC09     4                   6                    3

Chapter 29
The TIMESERIES Procedure

Contents

   Overview: TIMESERIES Procedure
   Getting Started: TIMESERIES Procedure
   Syntax: TIMESERIES Procedure
      Functional Summary
      PROC TIMESERIES Statement
      BY Statement
      CORR Statement
      CROSSCORR Statement
      DECOMP Statement
      ID Statement
      SEASON Statement
      SPECTRA Statement
      SSA Statement
      TREND Statement
      VAR and CROSSVAR Statements
   Details: TIMESERIES Procedure
      Accumulation
      Missing Value Interpretation
      Time Series Transformation
      Time Series Differencing
      Descriptive Statistics
      Seasonal Decomposition
      Correlation Analysis
      Cross-Correlation Analysis
      Spectral Density Analysis
      Singular Spectrum Analysis
      Data Set Output
      OUT= Data Set
      OUTCORR= Data Set
      OUTCROSSCORR= Data Set
      OUTDECOMP= Data Set
      OUTSEASON= Data Set
      OUTSPECTRA= Data Set
      OUTSSA= Data Set
      OUTSUM= Data Set
      OUTTREND= Data Set
      _STATUS_ Variable Values
      Printed Output
      ODS Table Names
      ODS Graphics Names
   Examples: TIMESERIES Procedure
      Example 29.1: Accumulating Transactional Data into Time Series Data
      Example 29.2: Trend and Seasonal Analysis
      Example 29.3: Illustration of ODS Graphics
      Example 29.4: Illustration of Spectral Analysis
      Example 29.5: Illustration of Singular Spectrum Analysis
   References

Overview: TIMESERIES Procedure

The TIMESERIES procedure analyzes time-stamped transactional data with respect to time and accumulates the data into a time series format. The procedure can perform trend and seasonal analysis on the transactions. After the transactional data are accumulated, time domain and frequency domain analysis can be performed on the accumulated time series.

For seasonal analysis of the transaction data, various statistics can be computed for each season. For trend analysis of the transaction data, various statistics can be computed for each time period. The analysis is similar to applying the MEANS procedure of Base SAS software to each season or time period of concern.

After the transactional data are accumulated to form a time series and any missing values are interpreted, the accumulated time series can be functionally transformed using log, square root, logistic, or Box-Cox transformations. The time series can be further transformed using simple differencing, seasonal differencing, or both. After functional and difference transformations have been applied, the accumulated and transformed time series can be stored in an output data set. This working time series can then be analyzed further using various time series analysis techniques provided by this procedure or other SAS/ETS procedures.
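As a rough illustration of the transformation and differencing steps described above, the following sketch applies a log transformation followed by simple and seasonal differencing. It is an assumption-laden example rather than one taken from this chapter: the SASHELP.AIR sample data set, the output name WORK.AIR_WORKING, and the differencing orders are chosen only for illustration.

```sas
/* Sketch only: the input data set, output name, and orders are assumptions. */
proc timeseries data=sashelp.air out=work.air_working;
   /* DATE is the time ID variable; values are accumulated to months. */
   id date interval=month accumulate=total;
   /* Log transform, then first-order simple and seasonal differencing. */
   var air / transform=log dif=(1) sdif=(1);
run;
```

The transformation and differencing options are placed on the VAR statement so that different variables could, in principle, receive different treatment.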
Time series analyses performed by the TIMESERIES procedure include:

   descriptive (global) statistics
   seasonal decomposition/adjustment analysis
   correlation analysis
   cross-correlation analysis
   spectral analysis

All results of the transactional or time series analysis can be stored in output data sets or printed using the Output Delivery System (ODS).

The TIMESERIES procedure can process large amounts of time-stamped transactional data. Therefore, the analysis results are useful for large-scale time series analysis or (temporal) data mining. All of the results can be stored in output data sets in either a time series format (default) or in a coordinate format (transposed). The time series format is useful for preparing the data for subsequent analysis with other SAS/ETS procedures. For example, the working time series can be further analyzed, modeled, and forecast with other SAS/ETS procedures. The coordinate format is useful when using this procedure with SAS/STAT procedures or SAS Enterprise Miner. For example, clustering time-stamped transactional data can be achieved by using the results of this procedure with the clustering procedures of SAS/STAT and the nodes of SAS Enterprise Miner.

The EXPAND procedure can be used for the frequency conversion and transformations of time series output from this procedure.

Getting Started: TIMESERIES Procedure

This section outlines the use of the TIMESERIES procedure and gives a cursory description of some of the analysis techniques that can be performed on time-stamped transactional data.
Given an input data set that contains numerous transaction variables recorded over time at no specific frequency, the TIMESERIES procedure can form time series as follows:

   PROC TIMESERIES DATA=<input-data-set>
                   OUT=<output-data-set>;
      ID <time-ID-variable> INTERVAL=<frequency>
                            ACCUMULATE=<statistic>;
      VAR <time-series-variables>;
   RUN;

The TIMESERIES procedure forms time series from the input time-stamped transactional data. It can provide results in output data sets or in other output formats by using the Output Delivery System (ODS).

Time-stamped transactional data are often recorded at no fixed interval. Analysts often want to use time series analysis techniques that require fixed-time intervals. Therefore, the transactional data must be accumulated to form a fixed-interval time series.

Suppose that a bank wants to analyze the transactions associated with each of its customers over time. Further, suppose that the data set WORK.TRANSACTIONS contains four variables that are related to these transactions: CUSTOMER, DATE, WITHDRAWALS, and DEPOSITS. The following examples illustrate possible ways to analyze these transactions by using the TIMESERIES procedure.

To accumulate the time-stamped transactional data to form a daily time series based on the accumulated daily totals of each type of transaction (WITHDRAWALS and DEPOSITS), the following TIMESERIES procedure statements can be used:
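A minimal sketch of statements that fit the description above is shown here. Several parts are assumptions made for illustration and are not taken from the text: the sample rows in the DATA step are invented so that the sketch can run on its own, the output name DAILY is hypothetical, and the BY CUSTOMER statement reflects one reading of "each of its customers".

```sas
/* Hypothetical sample rows, made up only so that the sketch runs. */
/* The data are sorted by CUSTOMER and DATE, as BY processing needs. */
data transactions;
   format date date9.;
   input customer date : date9. withdrawals deposits;
   datalines;
1 03JAN2009 20 0
1 03JAN2009 35 100
1 05JAN2009 0 250
2 04JAN2009 60 0
2 04JAN2009 0 40
;

/* Accumulate the transactions into daily totals for each customer. */
/* DAILY is a hypothetical output data set name.                    */
proc timeseries data=transactions out=daily;
   by customer;
   id date interval=day accumulate=total;
   var withdrawals deposits;
run;
```

ACCUMULATE=TOTAL sums all transactions that fall within the same day, which matches the "accumulated daily totals" wording; a different statistic could be substituted if another summary is wanted.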