The TIMESERIES Procedure

Singular Spectrum Analysis

Given a time series, $y_ t$, for $t=1,\ldots , T$, and a window length, $2\le L < T/2$, singular spectrum analysis (Golyandina, Nekrutkin, and Zhigljavsky 2001) decomposes the time series into spectral groupings using the following steps:

Embedding Step

Using the time series, form a $K\times L$ trajectory matrix, $\mathbf{X}$, with elements

\[  \mathbf{X}=\{ x_{k,l}\} ^{K,L}_{k=1,l=1}  \]

such that $x_{k,l}=y_{k+l-1}$ for $k=1,\ldots , K$ and $l=1,\ldots , L$, where $K=T-L+1$. By definition, $L\le K < T$ because $2\le L<T/2$.
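The embedding can be illustrated with a minimal Python sketch. The function name `trajectory_matrix` is hypothetical and not part of the procedure. It relies on the fact that each anti-diagonal of the trajectory matrix holds a single observation, so the entry in row $k$ and column $l$ is the observation at time $k+l-1$, which is the same indexing that the averaging step uses.

```python
def trajectory_matrix(y, L):
    """Return the K x L trajectory matrix of y as a list of rows, with K = T - L + 1."""
    T = len(y)
    if not 2 <= L <= T / 2:
        raise ValueError("window length must satisfy 2 <= L <= T/2")
    K = T - L + 1
    # With 1-based indices, x[k][l] = y[k + l - 1]; with 0-based indices this is y[k + l].
    return [[y[k + l] for l in range(L)] for k in range(K)]


y = [1, 2, 3, 4, 5, 6, 7]          # T = 7
X = trajectory_matrix(y, L=3)      # K = 7 - 3 + 1 = 5
for row in X:
    print(row)
# [1, 2, 3]
# [2, 3, 4]
# [3, 4, 5]
# [4, 5, 6]
# [5, 6, 7]
```

Each row is a lagged window of length $L$ and consecutive rows shift the window by one time period.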

Decomposition Step

Apply singular value decomposition to the trajectory matrix, $\mathbf{X}$:

\[  \mathbf{X}=\mathbf{U}\mathbf{Q}\mathbf{V}^ T  \]

where $\mathbf{U}$ represents the $K\times L$ matrix that contains the left-hand-side (LHS) eigenvectors, $\mathbf{Q}$ represents the diagonal $L\times L$ matrix that contains the singular values, and $\mathbf{V}$ represents the $L\times L$ matrix that contains the right-hand-side (RHS) eigenvectors.

Therefore,

\[  \mathbf{X} = \sum _{l=1}^{L}\mathbf{X}^{(l)} = \sum _{l=1}^{L}\mathbf{u}_ l q_ l \mathbf{v}_ l^ T  \]

where $\mathbf{X}^{(l)}$ represents the $K\times L$ principal component matrix, $\mathbf{u}_ l$ represents the $K\times 1$ left-hand-side (LHS) eigenvector, $q_ l$ represents the singular value, and $\mathbf{v}_ l$ represents the $L\times 1$ right-hand-side (RHS) eigenvector associated with the $l$th window index.
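As background for the eigenvector terminology, the following standard linear algebra identities relate the singular values and vectors to the eigenvalues and eigenvectors of the cross-product matrices. They assume that the vectors are normalized to unit length, and they are not a statement about how the procedure computes the decomposition internally:

\[  \mathbf{X}^ T\mathbf{X}\, \mathbf{v}_ l = q_ l^2\, \mathbf{v}_ l \qquad \mathbf{X}\mathbf{X}^ T\, \mathbf{u}_ l = q_ l^2\, \mathbf{u}_ l \qquad \mathbf{u}_ l = \frac{1}{q_ l}\mathbf{X}\mathbf{v}_ l \quad \textrm{for}\quad q_ l > 0  \]

Each principal component matrix $\mathbf{X}^{(l)}=\mathbf{u}_ l q_ l \mathbf{v}_ l^ T$ is therefore a rank-one matrix whose overall scale is set by the singular value $q_ l$.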

Grouping Step

For each group index, $m=1,\ldots , M$, define a group of window indices $I_ m \subset \{ 1,\ldots , L\} $. Let

\[  \mathbf{X}_{I_ m} = \sum _{l\in I_ m}\mathbf{X}^{(l)} = \sum _{l\in I_ m}\mathbf{u}_ l q_ l \mathbf{v}_ l^ T  \]

represent the grouped trajectory matrix for group $I_ m$. If the groupings represent a spectral partition,

\[  \bigcup _{m=1}^ M I_ m = \{  1,\ldots ,L\}  \qquad \textrm{and}\qquad I_ m\cap I_ n = \emptyset \quad \textrm{for}\quad m\ne n  \]

then, according to singular value decomposition theory,

\[  \mathbf{X} = \sum _{m=1}^ M \mathbf{X}_{I_ m}  \]
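The two partition conditions (the groups cover every window index, and no index appears in two groups) can be checked mechanically. The following Python sketch uses a hypothetical helper, `is_spectral_partition`, that takes the groups as lists of 1-based window indices.

```python
def is_spectral_partition(groups, L):
    """Return True if the groups are pairwise disjoint and together cover 1..L."""
    seen = set()
    for group in groups:
        members = set(group)
        if members & seen:              # an index already belongs to an earlier group
            return False
        seen |= members
    return seen == set(range(1, L + 1))


print(is_spectral_partition([[1], [2, 3], [4]], L=4))     # True
print(is_spectral_partition([[1], [2, 3]], L=4))          # False: index 4 is missing
print(is_spectral_partition([[1, 2], [2, 3, 4]], L=4))    # False: index 2 is repeated
```

When the check fails, the grouped trajectory matrices no longer sum to $\mathbf{X}$, so the component series do not sum to the original series.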

Averaging Step

For each group index, $m=1,\ldots , M$, compute the diagonal average of $\mathbf{X}_{I_ m}$,

\[  \tilde{x}_ t^{(m)} = \frac{1}{n_ t} \sum _{l=s_ t}^{e_ t} x_{t-l+1,l}^{(m)}  \]

where

\begin{equation*} \begin{array}{lll@{\qquad }lrcl} s_ t = 1, & e_ t = t, & n_ t = t &  \textrm{for\  } & 1\le & t&  < L \\ s_ t = 1, & e_ t = L, & n_ t = L &  \textrm{for\  } & L\le & t&  \le T - L + 1 \\ s_ t = t-T+L, & e_ t = L, & n_ t = T-t+1 &  \textrm{for\  } & T-L+1 < & t&  \le T \end{array}\end{equation*}

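If the groupings represent a spectral partition, then, because diagonal averaging is linear and the trajectory matrix $\mathbf{X}$ is constant along each anti-diagonal, the averaged components sum to the original series: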

\[  y_ t = \sum _{m=1}^ M\tilde{x}_ t^{(m)}  \]

Hence, singular spectrum analysis additively decomposes the original time series, $y_ t$, into $M$ component series $\tilde{x}_ t^{(m)}$ for $m=1,\ldots , M$.
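Diagonal averaging can be sketched in a few lines of Python. The names `diagonal_average` and `n_t` are hypothetical. The sketch averages every entry whose row and column indices satisfy $k+l-1=t$, and it applies the averaging to a trajectory matrix itself, which should return the original series unchanged.

```python
def diagonal_average(X):
    """Average a K x L matrix along its anti-diagonals; return (averages, counts)."""
    K, L = len(X), len(X[0])
    T = K + L - 1
    totals = [0.0] * T
    counts = [0] * T
    for k in range(K):
        for l in range(L):
            totals[k + l] += X[k][l]    # 0-based k + l is the 0-based time index
            counts[k + l] += 1
    averages = [total / n for total, n in zip(totals, counts)]
    return averages, counts


def n_t(t, T, L):
    """Number of averaged elements for the 1-based time index t (piecewise rule)."""
    if t < L:
        return t
    if t <= T - L + 1:
        return L
    return T - t + 1


y = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0]
T, L = len(y), 3
X = [[y[k + l] for l in range(L)] for k in range(T - L + 1)]   # trajectory matrix

averages, counts = diagonal_average(X)
print(averages)                                  # [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0]
print(counts)                                    # [1, 2, 3, 3, 3, 2, 1]
print([n_t(t, T, L) for t in range(1, T + 1)])   # [1, 2, 3, 3, 3, 2, 1]
```

The counts grow from 1 to $L$, stay at $L$ in the middle of the series, and shrink back to 1, which matches the piecewise definition of $n_ t$.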

Specifying the Window Length

You can explicitly specify the maximum window length, $2\le L\le 1000$, using the LENGTH= option or implicitly specify the window length using the INTERVAL= option in the ID statement or the SEASONALITY= option in the PROC TIMESERIES statement.

Either way, the window length is reduced based on the accumulated time series length, $T$, to enforce the requirement that $2\le L\le T/2$.
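The reduction can be pictured as a simple clamp. The following Python sketch is an assumption about the arithmetic only: the helper name `effective_window_length` is hypothetical, and rounding $T/2$ down to an integer is assumed because the text above does not state how the procedure rounds.

```python
def effective_window_length(requested, T):
    """Reduce a requested window length so that 2 <= L <= T/2 (assumes floor rounding)."""
    if T < 4:
        raise ValueError("need T >= 4 so that some L with 2 <= L <= T/2 exists")
    if requested < 2:
        raise ValueError("window length must be at least 2")
    return min(requested, T // 2)


print(effective_window_length(12, 144))   # 12: the series is long enough
print(effective_window_length(12, 20))    # 10: reduced to T/2
print(effective_window_length(12, 21))    # 10: reduced to floor(T/2) under the assumed rounding
```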

Specifying the Groups

You can use the GROUPS= option to explicitly specify the composition and number of groups, $I_ m\subset \{ 1,\ldots , L\} $, or use the THRESHOLDPCT= option in the SSA statement to implicitly specify the grouping. The THRESHOLDPCT= option is useful for removing noise or less dominant patterns from the accumulated time series.

Let $0<\alpha <1$ be the cumulative percent singular value threshold that is specified by the THRESHOLDPCT= option. Then the last group, $I_ M=\{ l_\alpha ,\ldots ,L\} $, is determined by the smallest value of $l_\alpha $ such that

\[  \left(\sum _{l=1}^{l_\alpha -1}q_ l \bigg/ \sum _{l=1}^{L}q_ l \right) \ge \alpha \qquad \textrm{where\  } 1 < l_\alpha \le L  \]

Using this rule, the last group, $I_ M$, describes the least dominant patterns in the time series and the size of the last group is at least one and is less than the window length, $L\ge 2$.
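The threshold rule can be traced with a short Python sketch. The function name `last_group_start` and the singular values are made up for illustration. The fallback to $l_\alpha = L$ when the threshold is never reached is an assumption, chosen so that the last group always has at least one member.

```python
def last_group_start(q, alpha):
    """Return the smallest 1-based l_alpha in 2..L whose preceding
    cumulative singular value share is at least alpha."""
    L = len(q)
    total = sum(q)
    running = 0.0
    for l_alpha in range(2, L + 1):
        running += q[l_alpha - 2]       # adds q_{l_alpha - 1} in 1-based notation
        if running / total >= alpha:
            return l_alpha
    return L                            # assumption: keep at least one index in the last group


q = [10.0, 5.0, 3.0, 1.5, 0.5]          # hypothetical singular values, largest first
print(last_group_start(q, 0.80))        # 4: shares are 0.50, 0.75, 0.90, so I_M = {4, 5}
print(last_group_start(q, 0.50))        # 2: the first singular value alone reaches 0.50
print(last_group_start(q, 0.99))        # 5: threshold never reached, fallback to L
```

With $\alpha = 0.80$, the first three components are kept as the dominant patterns and components 4 and 5 form the last group.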

The magnitudes of the principal components, which are plotted using the PLOT=SSA option and selected by the THRESHOLDPCT= option, are based on the singular values, which appear on the diagonal of $\mathbf{Q}$. Alternatively, each principal component's contribution to variation in the series can be quantified by using the squares of the singular values. The relative contributions of the principal components to variation in the series are included in the printed tabular output produced by the PRINT=SSA option.
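The difference between the two measures can be seen by normalizing the same singular values in both ways. This Python sketch uses a hypothetical helper, `relative_shares`, and made-up singular values; it does not reproduce the layout of the printed output.

```python
def relative_shares(q, power=1):
    """Normalize singular values (power=1) or their squares (power=2) to sum to one."""
    weights = [value ** power for value in q]
    total = sum(weights)
    return [weight / total for weight in weights]


q = [10.0, 5.0, 3.0, 1.5, 0.5]          # hypothetical singular values, largest first
by_magnitude = relative_shares(q, power=1)
by_variation = relative_shares(q, power=2)
for l, (a, b) in enumerate(zip(by_magnitude, by_variation), start=1):
    print(l, round(a, 3), round(b, 3))
```

Squaring gives more weight to the leading components, so a component's share of the variation is larger than its share of the singular value total when it is dominant and smaller when it is not.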