The SSM Procedure (Experimental)

Predefined Structural Models

The SSM procedure provides a set of predefined models for the class of models that the time series literature calls structural models. You can use these predefined models to capture trend, seasonal, and cyclical patterns in univariate and multivariate time series. For the most part, the multivariate models are straightforward generalizations of the corresponding univariate models. For example, the multivariate random walk trend described later in this section generalizes the univariate random walk trend that is described in the section Random Walk Trend. All of these models, with the exception of the continuous-time cycle model, are applicable only to the regular data type. The continuous-time cycle model is applicable to all the data types; however, it is available for the univariate case only.

To specify these models, you must first use the STATE statement with the appropriate TYPE= option. When you specify the TYPE= option, you do not need to specify other options of the STATE statement (for example, the T option, the COV1 option, and the A1 option). However, you must specify the COV option, which describes the covariance of the disturbance term that drives the state equation. Throughout this section, the symmetric matrix specified by using the COV option is denoted by $\pmb {\Sigma }$. For TYPE=LL, an additional matrix, specified by using the SLOPECOV suboption, also plays a role; it is denoted by $\pmb {\Sigma }_{slope}$. Next, you must specify one or more COMPONENT statements to define the (univariate) components that are based on this state subsection, so that they can be included in the MODEL statement. These univariate components exhibit interesting behavior based on the structure of $\pmb {\Sigma }$ (and $\pmb {\Sigma }_{slope}$, whenever applicable). For example, imposing rank restrictions on $\pmb {\Sigma }$ in the multivariate random walk results in these univariate trends moving together. For additional information about these models, see Harvey (1989).

The following example summarizes the steps needed to define a multivariate structural model by using a sequence of STATE and COMPONENT statements. For a full example, see Example 27.1. Suppose that a three-dimensional time series is being studied with response variables y1, y2, and y3. Suppose you want to specify the trivariate structural model

\[  \mb {y}_{t} = \pmb {\mu }_{t} + \pmb {\psi }_{t} + \pmb {\epsilon }_{t}  \]

where $\mb {y}_{t} = (y_{1,t}, \; y_{2,t},\;  y_{3,t}) $ denotes the response series, and $ \pmb {\mu }_{t}$, $\pmb {\psi }_{t} $, and $\pmb {\epsilon }_{t} $ denote the trivariate trend, cycle, and white noise components, respectively. The three elements of $\pmb {\epsilon }_{t} $, the observation noise in the model, are not assumed to be independent. Therefore, you cannot specify them by using three IRREGULAR statements; you must include them in the state specification. The following (incomplete) statements show how to specify this model:

     state whiteNoise(3)  type=wn ...;
     component wn1 =  whiteNoise[1];
     component wn2 =  whiteNoise[2];
     component wn3 =  whiteNoise[3];
     
     state randomWalk(3)  type=rw ...;
     component rw1 =  randomWalk[1];
     component rw2 =  randomWalk[2];
     component rw3 =  randomWalk[3];
     
     state cycleState(3)  type=cycle ...;
     component c1 = cycleState[1];
     component c2 = cycleState[2];
     component c3 = cycleState[3];
     
     model y1 = rw1 c1 wn1;
     model y2 = rw2 c2 wn2;
     model y3 = rw3 c3 wn3;

The first STATE statement defines whiteNoise, a state subsection that is needed for defining a three-dimensional white noise component. In turn, whiteNoise is used to define the three univariate white noise components: wn1, wn2, and wn3. These components are correlated; their correlation structure is controlled by the covariance specification of whiteNoise. The second set of STATE and COMPONENT statements results in three correlated random walk trend components: rw1, rw2, and rw3. Finally, the last set of STATE and COMPONENT statements results in three correlated cycle components: c1, c2, and c3. In the end, the desired multivariate model is defined by including these (univariate) components in the appropriate MODEL statements.

In the preceding example, it is important to note the relationship between the nominal dimension (denoted by dim throughout this section) that is specified in the STATE statement and the actual dimension of the resulting state subsection. Note that the three state subsections, whiteNoise, randomWalk, and cycleState, are defined by using the same dim specification: 3. However, the actual dimensions of these state subsections depend on their type; they do not need to equal this specified dimension. Here, whiteNoise and randomWalk do have the same size, 3, as the specified dim. However, the size of cycleState, which is of TYPE=CYCLE, is $2*dim = 6$. Another important point to note: no matter what the underlying size of the state subsection, the desired univariate components were obtained by using an identical specification scheme in the COMPONENT statement. This happens because the component specification style that is based on the element operator—[]—in the COMPONENT statement behaves differently when the TYPE= option is used to define the state subsection (see the section Multivariate Season for an illustration).
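
The relationship between the nominal dimension and the actual state size can be tabulated with a short Python sketch. This is only an illustration of the sizes stated in this section; the function name state_size and its string labels are hypothetical and are not part of the SSM procedure syntax.

```python
def state_size(state_type, dim, season_length=None):
    """Actual size of a state subsection for a nominal dimension dim.

    The labels "wn", "rw", "ll", "cycle", and "season" are hypothetical
    strings used only in this sketch; the sizes follow this section.
    """
    if state_type in ("wn", "rw"):
        return dim                      # white noise and random walk
    if state_type in ("ll", "cycle"):
        return 2 * dim                  # trend plus slope, or cycle plus auxiliary
    if state_type == "season":
        if season_length is None:
            raise ValueError("season_length is required for a season state")
        return (season_length - 1) * dim
    raise ValueError("unknown state type: " + state_type)


# The three state subsections of the preceding example, all with dim = 3.
sizes = {name: state_size(kind, 3)
         for name, kind in [("whiteNoise", "wn"),
                            ("randomWalk", "rw"),
                            ("cycleState", "cycle")]}
print(sizes)
```

For the example above, the sketch reports sizes 3, 3, and 6, matching the discussion of whiteNoise, randomWalk, and cycleState.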

The system matrices for all these models are time-invariant, with the exception of the continuous-time cycle model. In this section, $ \pmb {\alpha }_{t}$ denotes the state subsection under discussion (that is, the relevant block of the overall model state vector), and $\mb {T}$, $\mb {Q}$, $\mb {Q}_{1}$, and $\mb {A}_{1}$ denote the corresponding blocks of the larger system matrices.

For the multivariate cycle system matrices described in the section Multivariate Cycle, the Kronecker product notation is useful: if $\mb {A}$ is an $m \times n$ matrix and $\mb {B}$ is a $p \times q$ matrix, then the Kronecker product $\mb {A} \bigotimes \mb {B}$ is an $m p \times n q $ block matrix:

\[  \mb {A} \bigotimes \mb {B} = \left[ \begin{matrix}  a_{11} \mb {B}   &  \cdots   &  a_{1n} \mb {B}   \\ \vdots   &  \ddots   &  \vdots   \\ a_{m1} \mb {B}   &  \cdots   &  a_{m n} \mb {B}   \\ \end{matrix} \right]  \]
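
As a concrete illustration of this definition, the following Python sketch computes a Kronecker product for matrices stored as lists of rows. The helper name kron is chosen here for illustration only.

```python
def kron(a, b):
    """Kronecker product of an m x n matrix a and a p x q matrix b."""
    m, n = len(a), len(a[0])
    p, q = len(b), len(b[0])
    out = [[0] * (n * q) for _ in range(m * p)]
    for i in range(m):
        for j in range(n):
            for k in range(p):
                for l in range(q):
                    # Block (i, j) of the result is a[i][j] * b.
                    out[i * p + k][j * q + l] = a[i][j] * b[k][l]
    return out


A = [[1, 2],
     [3, 4]]
I2 = [[1, 0],
      [0, 1]]
K = kron(A, I2)   # a 4 x 4 matrix built from 2 x 2 blocks a[i][j] * I2
for row in K:
    print(row)
```

Each entry of A is replaced by that entry times the second matrix, so the product of a 2 x 2 matrix with a 2 x 2 identity is a 4 x 4 matrix with diagonal blocks.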

Multivariate White Noise

The STATE statement option TYPE=WN specifies white noise of dimension dim—that is, a sequence of zero mean, independent, Gaussian vectors with covariance $\pmb {\Sigma }$. The specification of the associated system matrices is trivial: $\mb {T}$ is zero, $\mb {Q} = \pmb {\Sigma }$, and the initial condition is nondiffuse ($\mb {Q}_{1} = \pmb {\Sigma }$ and $\mb {A}_{1} = 0$).

Multivariate white noise is needed to specify the observation equation noise term in multivariate time series models. Because the state space formulation used by the SSM procedure requires the observation equation noise vector to have a diagonal covariance matrix, you need to include a correlated noise vector in the state. The noise term for the $i$th response variable is defined by a component that simply picks the $i$th element of this multivariate white noise. For example, the component wn_i defined as follows can be used as a noise term in the MODEL statement of the $i$th response variable:

     state white(dim) type=wn ...;
     component wn_3 = white[3];

Multivariate Random Walk Trend

The STATE statement option TYPE=RW specifies a dim-dimensional random walk

\[  \pmb {\alpha }_{t+1} = \pmb {\alpha }_{t} + \pmb {\eta }_{t+1}  \]

where $ \pmb {\eta }_{t}$ is a sequence of zero mean, independent, Gaussian vectors with covariance $\pmb {\Sigma }$. The specification of the associated system matrices is trivial: $\mb {T}$ is a dim-dimensional identity matrix, $ \mb {I}_{dim}$, $\mb {Q} = \pmb {\Sigma }$, and the initial condition is fully diffuse ($\mb {Q}_{1} = 0$ and $\mb {A}_{1} = \mb {I}_{dim}$).

The multivariate random walk is a useful trend model for multivariate time series data. The trend term for the $i$th response variable is defined by a component that simply picks the $i$th ($1 \leq i \leq dim$) element of $\pmb {\alpha }_{t}$. For example, the component rw_i defined as follows can be used as a trend term in the MODEL statement of the $i$th response variable:

     state randomWalk(3) type=rw ...;
     component rw_2 = randomWalk[2];

Multivariate Local Linear Trend

The STATE statement option TYPE=LL specifies a (2*dim)-dimensional $\pmb {\alpha }_{t}$, needed for defining a dim-dimensional local linear trend. The first dim elements of $\pmb {\alpha }_{t}$ correspond to the needed multivariate trend, and the subsequent dim elements are needed to capture the slope vector of this trend. $\pmb {\alpha }_{t}$ can be defined as

\[  \pmb {\alpha }_{t+1} = \mb {T} \pmb {\alpha }_{t} + \pmb {\eta }_{t+1}  \]

where $ \pmb {\eta }_{t}$ is a sequence of zero mean, independent, Gaussian vectors with covariance $\mr {Diag}(\pmb {\Sigma }, \;  \pmb {\Sigma }_{slope})$ and $\mb {T}$ is a 2*dim-dimensional block matrix $\mb {T} = (\mb {I}_{dim} \;  \mb {I}_{dim}, \;  \mb {0} \; \mb {I}_{dim} )$. The initial condition is fully diffuse ($\mb {Q}_{1} = 0$ and $\mb {A}_{1} =\mb {I}_{2*dim}$). This is a multivariate generalization of the univariate local linear trend.
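
To see what this transition matrix does, the following Python sketch builds the block matrix for a given dim and propagates a state with the disturbance set to zero. The helper names local_linear_T and mat_vec are hypothetical and serve only this illustration.

```python
def local_linear_T(dim):
    """Transition matrix with blocks [[I, I], [0, I]], size 2*dim x 2*dim."""
    size = 2 * dim
    T = [[0.0] * size for _ in range(size)]
    for i in range(dim):
        T[i][i] = 1.0              # trend keeps its current level
        T[i][dim + i] = 1.0        # trend is incremented by its slope
        T[dim + i][dim + i] = 1.0  # slope carries over unchanged
    return T


def mat_vec(M, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


T = local_linear_T(2)
alpha = [10.0, 20.0, 1.0, -2.0]   # (trend_1, trend_2, slope_1, slope_2)
path = [alpha]
for _ in range(3):
    alpha = mat_vec(T, alpha)     # disturbance term set to zero
    path.append(alpha)
for state in path:
    print(state)
```

With zero disturbances each trend element moves along a straight line whose increment is the matching slope element; the disturbances with covariances $\pmb {\Sigma }$ and $\pmb {\Sigma }_{slope}$ let the level and the slope drift over time.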

The multivariate local linear trend is a useful trend model for multivariate time series data. The trend term for the $i$th response variable is defined by a component that simply picks the $i$th element ($1 \leq i \leq dim$) of $\pmb {\alpha }_{t}$. For example, the component ll_i defined as follows can be used as a trend term in the MODEL statement of the $i$th response variable:

     state localLin(dim) type=ll(slopecov..) ...;
     component ll_3 = localLin[3];

Multivariate Cycle

The STATE statement option TYPE=CYCLE specifies a (2*dim)-dimensional $\pmb {\alpha }_{t}$, needed for defining a dim-dimensional cycle. As in the LL case, the first dim elements of $\pmb {\alpha }_{t}$ correspond to the needed dim-dimensional cycle, and the remaining dim elements contain some auxiliary quantities. The cycle model defined in this subsection requires a regular data type—that is, the CT option is not included. Let $\rho $ denote the damping factor, and let $\lambda = 2\pi /$period be the frequency associated with the cycle. The admissible parameter ranges are $0 < \rho \leq 1$ and period $ > 2$, which implies that $0 < \lambda < \pi $. Let $\mb {C} = \rho (\cos (\lambda ) \;  \sin (\lambda ) , \;  -\sin (\lambda ) \;  \cos (\lambda ) ) $, a $2 \times 2$ matrix, and let $\mb {T} = \mb {C} \bigotimes \mb {I}_{dim}$, a $2*dim \times 2*dim$ matrix. With this notation, the transition equation associated with $\pmb {\alpha }_{t}$ is

\[  \pmb {\alpha }_{t+1} = \mb {T} \pmb {\alpha }_{t} + \pmb {\eta }_{t+1}  \]

where $ \pmb {\eta }_{t}$ is a sequence of zero mean, independent, $(2*dim)$-dimensional Gaussian vectors with covariance $\mr {Diag}(\pmb {\Sigma }, \; \pmb {\Sigma })$. If $\rho = 1$, the initial condition is fully diffuse ($\mb {Q}_{1} = 0$ and $\mb {A}_{1} =\mb {I}_{2*dim}$). Otherwise, it is nondiffuse: $\mb {Q}_{1} = \frac{1}{(1 - \rho ^{2})}\mr {Diag}(\pmb {\Sigma }, \; \pmb {\Sigma })$ and $\mb {A}_{1} =0$.
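
The nondiffuse initial covariance can be checked numerically. The following Python sketch builds $\mb {T}$ and the disturbance covariance for a bivariate cycle and verifies that the stated $\mb {Q}_{1}$ satisfies the stationarity condition that the covariance is unchanged by one transition step. The parameter values and the helper names are assumptions chosen for illustration.

```python
import math


def kron(a, b):
    """Kronecker product of two matrices stored as lists of rows."""
    p, q = len(b), len(b[0])
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(q)]
            for i in range(len(a)) for k in range(p)]


def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def cycle_system(rho, period, sigma):
    """Return (T, Q) for a discrete-time cycle with covariance sigma."""
    dim = len(sigma)
    lam = 2.0 * math.pi / period
    c, s = math.cos(lam), math.sin(lam)
    C = [[rho * c, rho * s], [-rho * s, rho * c]]
    eye = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    T = kron(C, eye)                             # T = C (x) I_dim
    Q = kron([[1.0, 0.0], [0.0, 1.0]], sigma)    # Diag(Sigma, Sigma)
    return T, Q


rho, period = 0.9, 12.0                  # illustrative values, rho < 1
Sigma = [[1.0, 0.5], [0.5, 2.0]]
T, Q = cycle_system(rho, period, Sigma)
Q1 = [[q / (1.0 - rho ** 2) for q in row] for row in Q]

# Stationarity requires Q1 = T Q1 T' + Q.
TQ1Tt = mat_mul(mat_mul(T, Q1), transpose(T))
max_gap = max(abs(TQ1Tt[i][j] + Q[i][j] - Q1[i][j])
              for i in range(4) for j in range(4))
print(max_gap)
```

The gap is zero up to rounding error because $\mb {C} \mb {C}^{\prime } = \rho ^{2} \mb {I}_{2}$, so one transition step scales the covariance by $\rho ^{2}$ before the disturbance covariance is added.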

The multivariate cycle is useful for capturing periodic behavior for multivariate time series data. The cycle term for the $i$th response variable is defined by a component that simply picks the $i$th element of $\pmb {\alpha }_{t}$. For example, the component cycle_i defined as follows can be used as a cycle term in the MODEL statement of the $i$th response variable:

     state cycleState(dim) type=cycle  ...;
     component cycle_2 = cycleState[2];

Multivariate Season

The STATE statement option TYPE=SEASON(LENGTH=s) specifies a ((s–1)*dim)-dimensional $\pmb {\alpha }_{t}$, needed for defining a dim-dimensional trigonometric season component with season length s. A (multivariate) trigonometric season component, $\pmb {\gamma }$, is a sum of (multivariate) cycles of different frequencies,

\[  \pmb {\gamma } = \sum _{j = 1}^{[s/2]} \pmb {\gamma }_{j}  \]

where the constituent cycles $\pmb {\gamma }_{j}$, called harmonics, have frequencies $\lambda _ j = 2 \pi j/s$. All the harmonics are assumed to be statistically independent, have the same damping factor $\rho = 1$, and are governed by the disturbances with the same covariance matrix $\pmb {\Sigma }$. The number of harmonics, $[s/2]$, equals $s/2$ if $s$ is even and $(s-1)/2$ if it is odd. This means that specifying TYPE=SEASON(LENGTH=s) is equivalent to specifying $[s/2]$ cycle specifications with correct frequencies, damping factor $\rho = 1$, and the COV option restricted to the same covariance $\pmb {\Sigma }$. The resulting $\pmb {\alpha }_{t}$ is necessarily ((s-1)*dim)-dimensional. When the season length $s$ is even, the last harmonic cycle, $ \pmb {\gamma }_{s/2}$, has frequency $\pi $ and requires special attention. It is of dimension dim rather than 2*dim because its underlying state equation simplifies to a dim-variate autoregression with autoregression coefficient $-\mb {I}_{dim}$. As a result of this discussion, it is clear that the system matrices $ \mb {T}$ and $ \mb {Q}$ associated with the ((s-1)*dim)-dimensional $\pmb {\alpha }_{t}$ are block-diagonal with the blocks corresponding to the harmonics. The initial condition is fully diffuse.

For all the models discussed so far, the first dim elements of $\pmb {\alpha }_{t}$ provided the needed (multivariate) component. This is not the case for the (multivariate) season component. Extracting the $i$th seasonal component from $\pmb {\alpha }_{t}$ requires accumulating the contributions from the $[s/2]$ harmonics that are associated with this $i$th seasonal, which are not organized contiguously in $\pmb {\alpha }_{t}$. For example, suppose that dim is 2 and the season length s is 4. In this case $[s/2]$ is 2, and the bivariate seasonal component is a sum of two independent bivariate cycles, $ \pmb {\gamma }_{1}$ and $ \pmb {\gamma }_{2}$. The cycle $\pmb {\gamma }_{1}$ has frequency $\pi /2$ and its underlying state, say $\pmb {\alpha }_{t}^{a}$, has dimension $2 * dim = 4$. The last harmonic, $\pmb {\gamma }_{2}$, has frequency $\pi $, and therefore its underlying state, say $\pmb {\alpha }_{t}^{b}$, has dimension 2. The combined state $\pmb {\alpha }_{t} = ( \pmb {\alpha }_{t}^{a}, \pmb {\alpha }_{t}^{b} )$ has dimension $6 = 4 + 2$. To extract the first element of the bivariate seasonal component, you must extract the first elements of the bivariate cycles $\pmb {\gamma }_{1}$ and $\pmb {\gamma }_{2}$, which in turn are the first elements of $ \pmb {\alpha }_{t}^{a}$ and $ \pmb {\alpha }_{t}^{b}$, respectively. Thus, obtaining the first element of the bivariate seasonal component requires summing the first and the fifth elements of the combined state $ \pmb {\alpha }_{t}$. Similarly, obtaining the second element requires summing the second and the sixth elements of the combined state $ \pmb {\alpha }_{t}$. All this can be summarized by the dot product expressions

\[  s_{1 t} = ( 1 \;  0 \;  0 \;  0 \;  1 \;  0 ) \; \pmb {\alpha }_{t}  \]

\[  s_{2 t} = ( 0 \;  1 \;  0 \;  0 \;  0 \;  1 ) \; \pmb {\alpha }_{t}  \]

where $ s_{ 1 t}$ and $ s_{ 2 t}$ denote the first and second components, respectively, of the bivariate seasonal component. Note that $ s_{ 1 t}$ and $ s_{ 2 t}$ are univariate seasonal components, each of season length $4$, in their own right. They are correlated components; their correlation structure depends on $\pmb {\Sigma }$.
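
The selection vectors in these dot products can be generated mechanically. The following Python sketch does so under the assumption that the harmonic blocks are laid out for general dim and s in the same way as in the dim = 2, s = 4 example: one block of size 2*dim per harmonic, with the cycle in the first dim positions, and a final block of size dim when s is even. The function name season_selector is hypothetical.

```python
def season_selector(i, dim, s):
    """0/1 vector that sums the harmonic contributions to the i-th seasonal.

    i is 1-based. The block layout is an assumption extrapolated from the
    dim = 2, s = 4 example in the text.
    """
    if not 1 <= i <= dim:
        raise ValueError("i must satisfy 1 <= i <= dim")
    selector = []
    for j in range(1, s // 2 + 1):
        # The harmonic with frequency pi (j = s/2, s even) needs only dim
        # state elements; every other harmonic needs 2*dim.
        block_size = dim if 2 * j == s else 2 * dim
        block = [0] * block_size
        block[i - 1] = 1          # i-th element of this harmonic's cycle
        selector.extend(block)
    return selector


s1 = season_selector(1, dim=2, s=4)
s2 = season_selector(2, dim=2, s=4)
print(s1)
print(s2)
```

For dim = 2 and s = 4 the sketch reproduces the two row vectors shown above, and for any dim and s the vector has length (s-1)*dim.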

Obtaining the desired components of the multivariate seasonal component is made easy by a special syntax convention of the COMPONENT statement. Continuing with the previous example, the following examples illustrate two equivalent ways of obtaining $ s_{ 1 t}$ and $ s_{ 2 t}$. The first set of statements explicitly specifies the linear combinations needed for defining $ s_{ 1 t}$ and $ s_{ 2 t}$:

     state seasonState(2) type=season(length=4)  ...;
     component s_1 =( 1  0  0  0  1  0 ) * seasonState;
     component s_2 =( 0  1  0  0  0  1 ) * seasonState;

The following simpler specification achieves the same result:

     state seasonState(2) type=season(length=4)  ...;
     component s_1 = seasonState[1];
     component s_2 = seasonState[2];

In the latter specification, the meaning of the element operator [] changes if the state in question is defined by using the TYPE= option.

Multivariate ARMA

You can specify a state vector that follows a vector autoregressive moving average (VARMA) model by using the STATE statement option TYPE=VARMA. The autoregressive and moving average orders can be either 0 or 1 ($0 \leq p \leq 1$ and $0 \leq q \leq 1$); that is, only VAR(1), MA(1), and VARMA(1,1) models can be specified. The notation and the state space form of the VARMA model described here are taken from Reinsel (1997), which is a good reference for VARMA modeling.

A dim-dimensional vector process $\pmb {\gamma }_{t}$ follows a zero-mean, autoregressive order p, moving average order q (VARMA(p, q)) model if it satisfies the following matrix difference equation:

\[  \pmb {\gamma }_{t} = \sum _{i=1}^{p} \pmb {\Phi }_{i} \pmb {\gamma }_{t-i} + \pmb {\epsilon }_{t} - \sum _{j=1}^{q} \pmb {\Theta }_{j} \pmb {\epsilon }_{t-j}  \]

Here $ \pmb {\Phi }_{i}$ and $ \pmb {\Theta }_{j}$ are dim-dimensional square matrices and $ \pmb {\epsilon }_{t}$ is a dim-dimensional, Gaussian, white noise sequence with covariance matrix $\pmb {\Sigma }$. If autoregressive order p is 0, the term that involves $\pmb {\Phi }_{i}$ is absent; similarly, if the moving average order q is 0, the term that involves $ \pmb {\Theta }_{j}$ is absent. Since AR and MA orders can be at most 1, the subscripts of $ \pmb {\Phi }_{i}$ and $ \pmb {\Theta }_{j}$ can be ignored in this discussion—when applicable, an AR coefficient matrix is denoted by $ \pmb {\Phi }$ and an MA coefficient matrix is denoted by $ \pmb {\Theta }$. The unknown elements of $\pmb {\Phi }$, $ \pmb {\Theta }$, and $\pmb {\Sigma }$ constitute the parameter vector that is associated with a VARMA state. The process $\pmb {\gamma }_{t}$ defined by the VARMA difference equation is stationary and invertible (Reinsel 1997) if and only if the eigenvalues of $\pmb {\Phi }$ and $\pmb {\Theta }$ are strictly less than 1 in magnitude. By default, the SSM procedure imposes these stationarity and invertibility restrictions on $\pmb {\Phi }$ and $\pmb {\Theta }$. However, you can specify $\pmb {\Phi }$ to be an identity matrix, in which case the resulting process is nonstationary.

A VARMA model can be cast into a state space form. The state space form used by the SSM procedure is described in Reinsel (1997, pp. 52–53). The system matrices for the supported VARMA models are as follows:

  • The VAR(1) form is the simplest. In this case, the underlying state $\pmb {\alpha }_{t}$ is the same as the VAR(1) process $\pmb {\gamma }_{t}$. Therefore, $\mb {T} = \pmb {\Phi }$ and $\mb {Q}_{t} = \mb {Q} = \pmb {\Sigma }$.

  • Taking $\pmb {\Phi }$ equal to the zero matrix if $p=0$, the VARMA(1,1) and MA(1) cases can be treated together. In this case, the underlying state $\pmb {\alpha }_{t}$ is 2*dim dimensional and the desired VARMA process $\pmb {\gamma }_{t}$ corresponds to its first dim elements. Let $\pmb {\Psi } = \pmb {\Phi } - \pmb {\Theta } $. Then, in the blocked form,

    \[  \mb {T} = \left[ \begin{matrix}  \mb {0}   &  \mb {I}_{dim}   \\ \mb {0}   &  \pmb {\Phi }   \\ \end{matrix} \right] \; \; \; \;  \text {and} \; \; \; \;  \mb {Q}_{t} = \mb {Q} = \left[ \begin{matrix}  \pmb {\Sigma }   &  \pmb {\Sigma } \pmb {\Psi }^{\prime }  \\ \pmb {\Psi } \pmb {\Sigma }   &  \pmb {\Psi } \pmb {\Sigma } \pmb {\Psi }^{\prime }   \\ \end{matrix} \right]  \]

Unless $ \pmb {\Phi }$ is restricted to be identity, the underlying state $\pmb {\alpha }_{t}$ is stationary and the covariance of the initial condition is computed by

\[  \mi {vec}(\mb {Q}_{1}) = (\mb {I} - \mb {T}\bigotimes \mb {T})^{-1} \mi {vec}(\mb {Q})  \]

where $\bigotimes $ denotes the Kronecker product and the $\mi {vec}$ operation on a matrix creates a vector formed by vertically stacking the rows of that matrix. If $ \pmb {\Phi }$ is restricted to be identity, the initial condition is fully diffuse.
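
The following Python sketch assembles $\mb {T}$ and $\mb {Q}$ for a VARMA(1,1) state and computes the stationary initial covariance. Instead of the closed-form vec expression, it iterates the equivalent fixed-point relation $\mb {Q}_{1} = \mb {T} \mb {Q}_{1} \mb {T}^{\prime } + \mb {Q}$ until convergence, which is a substitute technique chosen to keep the example short. It is not a description of how the SSM procedure computes $\mb {Q}_{1}$. All function names and parameter values are assumptions made for illustration.

```python
def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def varma11_system(phi, theta, sigma):
    """Return (T, Q) in block form for a VARMA(1,1) state of size 2*dim."""
    dim = len(phi)
    psi = [[phi[i][j] - theta[i][j] for j in range(dim)] for i in range(dim)]
    zero = [[0.0] * dim for _ in range(dim)]
    eye = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    # T = [[0, I], [0, Phi]]
    T = ([zero[i] + eye[i] for i in range(dim)]
         + [zero[i] + phi[i] for i in range(dim)])
    sig_psit = mat_mul(sigma, transpose(psi))
    psi_sig = mat_mul(psi, sigma)
    psi_sig_psit = mat_mul(psi_sig, transpose(psi))
    # Q = [[Sigma, Sigma Psi'], [Psi Sigma, Psi Sigma Psi']]
    Q = ([sigma[i] + sig_psit[i] for i in range(dim)]
         + [psi_sig[i] + psi_sig_psit[i] for i in range(dim)])
    return T, Q


def stationary_cov(T, Q, tol=1e-13, max_iter=10000):
    """Iterate Q1 <- T Q1 T' + Q; converges when T has spectral radius < 1."""
    n = len(Q)
    Q1 = [row[:] for row in Q]
    Tt = transpose(T)
    for _ in range(max_iter):
        step = mat_mul(mat_mul(T, Q1), Tt)
        nxt = [[step[i][j] + Q[i][j] for j in range(n)] for i in range(n)]
        gap = max(abs(nxt[i][j] - Q1[i][j]) for i in range(n) for j in range(n))
        Q1 = nxt
        if gap < tol:
            break
    return Q1


# Univariate check (dim = 1) with illustrative values.
phi, theta, sigma = [[0.5]], [[0.2]], [[1.0]]
T, Q = varma11_system(phi, theta, sigma)
Q1 = stationary_cov(T, Q)
# Textbook ARMA(1,1) variance: (1 + theta^2 - 2*phi*theta) / (1 - phi^2).
arma_var = (1.0 + 0.2 ** 2 - 2.0 * 0.5 * 0.2) / (1.0 - 0.5 ** 2)
print(Q1[0][0], arma_var)
```

In the univariate case the first diagonal element of the computed covariance agrees with the familiar ARMA(1,1) variance formula, which is a useful sanity check on the block form of $\mb {T}$ and $\mb {Q}$.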

Continuous-Time Cycle

The STATE statement option TYPE=CYCLE(CT) specifies a two-dimensional $\pmb {\alpha }_{t}$, needed for defining a univariate continuous-time cycle. In this case the nominal dimension, dim, must be 1. In particular, $\pmb {\Sigma }$ becomes one-dimensional, which is denoted by $\sigma ^{2}$. This cycle can be used for any data type. As before, the parameters of the cycle are a damping factor $\rho $, $0 < \rho \leq 1$, and period $ > 0$. Unlike in the discrete-time cycle described in the section Multivariate Cycle, the period is not required to be larger than 2. Let $\lambda = 2\pi /$period, and let $h_ t = (\tau _{t+1} - \tau _{t})$ denote the difference between successive time points. In this case, the system matrices $\mb {T}$ and $\mb {Q}$ that govern $\pmb {\alpha }_{t}$ depend on $h_ t$. They are:

\[  \mb {T} = \rho ^{h_ t} \left[ \begin{matrix}  \cos (\lambda h_ t)   &  \sin (\lambda h_ t)   \\ -\sin (\lambda h_ t)   &  \cos (\lambda h_ t)   \\ \end{matrix} \right]  \]

\[  \mb {Q} = \frac{\sigma ^{2}(1 - \rho ^{2h_ t})}{-2\ln (\rho )} \mb {I}_{2} \qquad \text {if}\;  \rho < 1  \]

\[  \mb {Q} = \sigma ^{2} h_ t \mb {I}_{2} \qquad \text {if}\;  \rho = 1  \]

If $\rho < 1$, the initial condition is nondiffuse: $\mb {Q}_{1} = \frac{\sigma ^{2}}{-2\ln (\rho )}\mb {I}_{2}$. For $\rho = 1$, the initial condition is fully diffuse.
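
The dependence of these matrices on the time gap can be explored with a short Python sketch. It evaluates $\mb {T}$ and $\mb {Q}$ for a given gap and checks two consequences of the formulas: crossing a gap in two steps gives the same result as crossing it in one, and for $\rho < 1$ the disturbance variance approaches the stationary variance in $\mb {Q}_{1}$ as the gap grows. The function name ct_cycle_system and the parameter values are assumptions chosen for illustration.

```python
import math


def ct_cycle_system(rho, period, sigma2, h):
    """Return (T, Q) of the continuous-time cycle for a time gap h."""
    lam = 2.0 * math.pi / period
    scale = rho ** h
    c, s = math.cos(lam * h), math.sin(lam * h)
    T = [[scale * c, scale * s], [-scale * s, scale * c]]
    if rho < 1.0:
        q = sigma2 * (1.0 - rho ** (2.0 * h)) / (-2.0 * math.log(rho))
    else:
        q = sigma2 * h            # limiting form when rho = 1
    Q = [[q, 0.0], [0.0, q]]
    return T, Q


rho, period, sigma2 = 0.8, 1.5, 2.0   # a period below 2 is allowed here
T_a, Q_a = ct_cycle_system(rho, period, sigma2, h=0.5)
T_b, Q_b = ct_cycle_system(rho, period, sigma2, h=1.5)
T_ab, Q_ab = ct_cycle_system(rho, period, sigma2, h=2.0)

# Two successive gaps (0.5 then 1.5) versus one gap of 2.0.
composed_t00 = T_b[0][0] * T_a[0][0] + T_b[0][1] * T_a[1][0]
composed_q = rho ** (2.0 * 1.5) * Q_a[0][0] + Q_b[0][0]

# Stationary variance from the initial condition, and Q for a very long gap.
stationary_q = sigma2 / (-2.0 * math.log(rho))
_, Q_long = ct_cycle_system(rho, period, sigma2, h=200.0)
print(composed_t00 - T_ab[0][0], composed_q - Q_ab[0][0])
print(Q_long[0][0] - stationary_q)
```

Both differences are zero up to rounding error. This consistency across irregular spacing is what makes the continuous-time cycle usable for any data type.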

The first element of $\pmb {\alpha }_{t}$ corresponds to the needed cycle, and the second element is an auxiliary quantity. You can define a cycle term based on this state as follows:

     state cycleState(1) type=cycle(CT)  ...;
     component cycle = cycleState[1];

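To obtain this continuous-time cycle, you must include the CT option in the TYPE=CYCLE specification; without it, TYPE=CYCLE specifies the discrete-time cycle described in the section Multivariate Cycle.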