The VARMAX Procedure

Cointegration

This section briefly introduces the concepts of cointegration (Johansen, 1995a).

Definition 1.

(Engle and Granger, 1987): If a series $y_ t$ with no deterministic components can be represented by a stationary and invertible ARMA process after differencing d times, the series is integrated of order d, that is, $y_ t \sim I(d)$.
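As a rough illustration of Definition 1, the following sketch builds a random walk (an $I(1)$ series) from simulated shocks and checks that differencing once recovers the stationary shocks. The helper name `difference`, the seed, and the sample size are assumptions chosen for the illustration; they are not part of the VARMAX procedure.

```python
import random

def difference(series, d=1):
    """Apply the differencing operator (1 - B) to a list d times."""
    out = list(series)
    for _ in range(d):
        out = [out[i] - out[i - 1] for i in range(1, len(out))]
    return out

random.seed(0)  # arbitrary seed, for reproducibility only
shocks = [random.gauss(0.0, 1.0) for _ in range(200)]

# Build an I(1) series (a random walk) by accumulating the shocks once.
walk = []
level = 0.0
for e in shocks:
    level += e
    walk.append(level)

# One difference recovers the shocks (the first observation is lost).
recovered = difference(walk, d=1)
max_err = max(abs(r - e) for r, e in zip(recovered, shocks[1:]))
```

Each pass of differencing shortens the series by one observation, so a series that is $I(d)$ loses d observations before it becomes stationary.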

Definition 2.

(Engle and Granger, 1987): If all elements of the vector $\mb{y} _ t$ are $I(d)$ and there exists a cointegrating vector $\bbeta \neq 0$ such that $\bbeta '\mb{y} _ t \sim I(d-b)$ for some $b > 0$, the vector process is said to be cointegrated $CI(d,b)$.

A simple example of a cointegrated process is the following bivariate system:

\begin{eqnarray*}  y_{1t} & =&  \gamma y_{2t} + \epsilon _{1t} \\ y_{2t} & =&  y_{2,t-1} + \epsilon _{2t} \end{eqnarray*}

with $\epsilon _{1t}$ and $\epsilon _{2t}$ being uncorrelated white noise processes. In the second equation, $y_{2t}$ is a random walk: $\Delta y_{2t} = \epsilon _{2t}$, where $\Delta \equiv 1-B$ and B is the backshift operator. Differencing the first equation results in

\[  \Delta y_{1t} = \gamma \Delta y_{2t} +\Delta \epsilon _{1t} = \gamma \epsilon _{2t} +\epsilon _{1t}-\epsilon _{1,t-1}  \]

Thus, both $y_{1t}$ and $y_{2t}$ are $I(1)$ processes, but the linear combination $y_{1t} - \gamma y_{2t}$ is stationary. Hence $\mb{y} _ t =(y_{1t}, y_{2t})'$ is cointegrated with a cointegrating vector $\bbeta = (1, -\gamma )'$.

In general, if the vector process $\mb{y} _ t$ has k components, then there can be more than one cointegrating vector $\bbeta $. It is assumed that there are r linearly independent cointegrating vectors with $r<k$, which form the columns of the $k\times r$ matrix $\bbeta $. The rank of the matrix $\bbeta $ is r, which is called the cointegration rank of $\mb{y} _ t$.
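The bivariate example can be checked numerically. The sketch below simulates the two equations with a hypothetical value $\gamma = 2$ and standard normal shocks (both are assumptions made only for the illustration), and then confirms that the combination $y_{1t} - \gamma y_{2t}$ reproduces the white noise $\epsilon _{1t}$.

```python
import random

random.seed(1)       # arbitrary seed, for reproducibility only
gamma = 2.0          # hypothetical value of the coefficient gamma
n = 500

eps1 = [random.gauss(0.0, 1.0) for _ in range(n)]
eps2 = [random.gauss(0.0, 1.0) for _ in range(n)]

y1, y2 = [], []
level = 0.0
for t in range(n):
    level += eps2[t]                    # y2_t = y2_{t-1} + eps2_t
    y2.append(level)
    y1.append(gamma * level + eps1[t])  # y1_t = gamma * y2_t + eps1_t

# Apply the cointegrating vector beta = (1, -gamma)'.
z = [a - gamma * b for a, b in zip(y1, y2)]
max_gap = max(abs(zt - e) for zt, e in zip(z, eps1))
```

Both simulated series wander without a fixed mean, while `z` stays within the range of the shocks, which is the behavior the cointegrating vector is meant to capture.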

Common Trends

This section briefly discusses the implication of cointegration for the moving-average representation. If $\mb{y} _ t$ is cointegrated $CI(1,1)$, then $\Delta \mb{y} _ t$ has the Wold representation:

\begin{eqnarray*}  \Delta \mb{y} _ t = \bdelta + \Psi (B)\bepsilon _ t \end{eqnarray*}

where $\bepsilon _ t$ is $iid (0,\Sigma )$, $\Psi (B)=\sum _{j=0}^{\infty } \Psi _ jB^ j$ with $\Psi _0=I_ k$, and $\sum _{j=0}^{\infty }j|\Psi _ j| < \infty $.

Assume that $\bepsilon _ t = 0$ if $t\leq 0$ and $\mb{y} _0$ is a nonrandom initial value. Then the difference equation implies that

\begin{eqnarray*}  \mb{y} _ t = \mb{y} _0 + \bdelta t + \Psi (1)\sum _{i=0}^{t}\bepsilon _ i + \Psi ^{*}(B)\bepsilon _ t \end{eqnarray*}

where $\Psi ^{*}(B) = (1-B)^{-1}(\Psi (B)-\Psi (1))$ and $\Psi ^{*}(B)$ is absolutely summable.
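The decomposition can be verified in the simplest case. The sketch below assumes a scalar MA(1) in differences, $\Delta y_ t = \delta + \epsilon _ t + \psi \epsilon _{t-1}$, for which $\Psi (1) = 1+\psi $ and $\Psi ^{*}(B) = -\psi $. The parameter values are hypothetical, and the scalar setup is a simplification of the vector case discussed in the text.

```python
import random

random.seed(2)                    # arbitrary seed, for reproducibility only
delta, psi, y0 = 0.1, 0.5, 3.0    # hypothetical drift, MA coefficient, initial value
n = 50
# eps[0] = 0 mirrors the assumption that shocks vanish for t <= 0.
eps = [0.0] + [random.gauss(0.0, 1.0) for _ in range(n)]

# Left side: accumulate the differences directly.
y = [y0]
for t in range(1, n + 1):
    y.append(y[-1] + delta + eps[t] + psi * eps[t - 1])

# Right side: y0 + delta*t + Psi(1) * (cumulated shocks) + Psi*(B) eps_t,
# where Psi(1) = 1 + psi and Psi*(B) = -psi for this MA(1).
psi_one = 1.0 + psi
cum = 0.0
max_gap = 0.0
for t in range(1, n + 1):
    cum += eps[t]
    rhs = y0 + delta * t + psi_one * cum - psi * eps[t]
    max_gap = max(max_gap, abs(y[t] - rhs))
```

The term multiplied by $\Psi (1)$ is the accumulated (stochastic trend) part, and the remaining term is the stationary part.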

Assume that the rank of $\Psi (1)$ is $m=k-r$. When the process $\mb{y} _ t$ is cointegrated, there is a cointegrating $k\times r$ matrix $\bbeta $ such that $\bbeta ' \mb{y} _ t$ is stationary.

Premultiplying $\mb{y} _ t$ by $\bbeta '$ results in

\[  \bbeta ' \mb{y} _ t = \bbeta '\mb{y} _0 + \bbeta ' \Psi ^{*}(B)\bepsilon _ t  \]

because $\bbeta '\Psi (1)=0$ and $\bbeta '\bdelta =0$.
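As a worked check, consider the bivariate example from the beginning of this section. To satisfy $\Psi _0 = I_ k$, the shocks are re-expressed here as $u_{1t} = \epsilon _{1t} + \gamma \epsilon _{2t}$ and $u_{2t} = \epsilon _{2t}$; this relabeling is an assumption introduced for the illustration and does not appear in the original example.

```latex
\begin{eqnarray*}
\Delta y_{1t} & = & u_{1t} - u_{1,t-1} + \gamma u_{2,t-1} \\
\Delta y_{2t} & = & u_{2t} \\
\Psi (B) & = & I_2 + \left[ \begin{array}{cc} -1 & \gamma \\ 0 & 0 \end{array} \right] B,
\qquad
\Psi (1) = \left[ \begin{array}{cc} 0 & \gamma \\ 0 & 1 \end{array} \right] \\
\bbeta '\Psi (1) & = & (1, -\gamma ) \left[ \begin{array}{cc} 0 & \gamma \\ 0 & 1 \end{array} \right] = (0, \gamma - \gamma ) = (0, 0)
\end{eqnarray*}
```

Here $\Psi (1)$ has rank 1, which agrees with $m = k - r = 2 - 1$, and $\bdelta = 0$ in the example, so $\bbeta '\bdelta = 0$ holds trivially.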

Stock and Watson (1988) showed that the cointegrated process $\mb{y} _ t$ has a common trends representation derived from the moving-average representation. Since the rank of $\Psi (1)$ is $m=k-r$, there is a $k\times r$ matrix $H_1$ with rank r such that $\Psi (1)H_1=0$. Let $H_2$ be a $k\times m$ matrix with rank m such that $H_2'H_1=0$; then $A=\Psi (1)H_2$ has rank m. The matrix $H=(H_1,H_2)$ has rank k. By construction of H,

\begin{eqnarray*}  \Psi (1)H = [0, A] = A S_ m \end{eqnarray*}

where $S_ m=(0_{m\times r},I_ m)$. Since $\bbeta '\Psi (1)=0$ and $\bbeta '\bdelta =0$, $\bdelta $ lies in the column space of $\Psi (1)$ and can be written as

\begin{eqnarray*}  \bdelta = \Psi (1)\tilde{\bdelta } \end{eqnarray*}

where $\tilde{\bdelta }$ is a k-dimensional vector. The common trends representation is written as

\begin{eqnarray*}  \mb{y} _ t &  = &  \mb{y} _0 + \Psi (1)[\tilde{\bdelta } t + \sum _{i=0}^{t}\bepsilon _ i] + \Psi ^{*}(B)\bepsilon _ t \\ &  = &  \mb{y} _0 + \Psi (1)H[H^{-1}\tilde{\bdelta } t + H^{-1}\sum _{i=0}^{t}\bepsilon _ i] + \mb{a} _ t \\ &  = &  \mb{y} _0 + A\btau _ t + \mb{a} _ t \end{eqnarray*}

and

\[  \btau _ t = \pi + \btau _{t-1} + \mb{v} _ t  \]

where $\mb{a} _ t = \Psi ^{*}(B)\bepsilon _ t$, $\pi =S_ mH^{-1}\tilde{\bdelta }$, $\btau _ t= S_ m[H^{-1}\tilde{\bdelta } t + H^{-1}\sum _{i=0}^{t}\bepsilon _ i]$, and $\mb{v} _ t=S_ mH^{-1}\bepsilon _ t$.

Stock and Watson showed that the common trends representation expresses $\mb{y} _ t$ as a linear combination of m random walks ($\btau _ t$) with drift $\pi $ plus $I(0)$ components ($\mb{a} _ t$).
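For the bivariate example $y_{1t} = \gamma y_{2t} + \epsilon _{1t}$, $y_{2t} = y_{2,t-1} + \epsilon _{2t}$, the representation can be written out explicitly. The sketch below assumes the shocks are relabeled as $\mb{u} _ t = (\epsilon _{1t} + \gamma \epsilon _{2t}, \epsilon _{2t})'$ so that $\Psi _0 = I_2$; the particular choices of $H_1$ and $H_2$ are one convenient possibility, not the only one.

```latex
\begin{eqnarray*}
\Psi (1) & = & \left[ \begin{array}{cc} 0 & \gamma \\ 0 & 1 \end{array} \right], \qquad
H_1 = \left[ \begin{array}{c} 1 \\ 0 \end{array} \right], \qquad
H_2 = \left[ \begin{array}{c} 0 \\ 1 \end{array} \right], \qquad
H = I_2 \\
A & = & \Psi (1)H_2 = \left[ \begin{array}{c} \gamma \\ 1 \end{array} \right], \qquad
S_ m = (0, 1) \\
\btau _ t & = & S_ m H^{-1} \sum _{i=0}^{t} \mb{u} _ i = \sum _{i=0}^{t} \epsilon _{2i}, \qquad
\mb{a} _ t = \left[ \begin{array}{c} \epsilon _{1t} \\ 0 \end{array} \right] \\
\mb{y} _ t & = & \mb{y} _0 + \left[ \begin{array}{c} \gamma \\ 1 \end{array} \right] \btau _ t + \left[ \begin{array}{c} \epsilon _{1t} \\ 0 \end{array} \right]
\end{eqnarray*}
```

With $k=2$ and $r=1$, there is $m=1$ common trend: the random walk that drives $y_{2t}$ also drives $y_{1t}$ through the loading $\gamma $. The drift $\pi $ is zero because the example has $\bdelta = 0$.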

Test for the Common Trends

Stock and Watson (1988) proposed statistics for testing for common trends. The null hypothesis is that the k-dimensional time series $\mb{y} _{t}$ has m common stochastic trends, where $m\leq k$, and the alternative is that it has s common trends, where $s < m$. The test of m versus s common stochastic trends is based on the first-order serial correlation matrix of $\mb{y} _ t$. Let $\bbeta _{\bot }$ be a $k\times m$ matrix orthogonal to the cointegrating matrix such that $\bbeta _{\bot }^{'}\bbeta = 0$ and $\bbeta _{\bot }^{'}\bbeta _{\bot }=I_ m$. Let $\mb{z} _{t}=\bbeta '\mb{y} _ t$ and $\mb{w} _{t}=\bbeta _{\bot }^{'}\mb{y} _ t$. Then

\[  \mb{w} _{t} = \bbeta _{\bot }^{'}\mb{y} _0 + \bbeta _{\bot }^{'}\bdelta t + \bbeta _{\bot }^{'} \Psi (1)\sum _{i=0}^{t}\bepsilon _ i + \bbeta _{\bot }^{'} \Psi ^{*}(B)\bepsilon _ t  \]

Combining the expression of $\mb{z} _ t$ and $\mb{w} _ t$,

\begin{eqnarray*}  \left[ \begin{array}{c} \mb{z} _ t \\ \mb{w} _ t \end{array} \right] &  = &  \left[ \begin{array}{c} \bbeta '\mb{y} _0 \\ \bbeta _{\bot }^{'}\mb{y} _0 \end{array} \right] + \left[ \begin{array}{c} 0 \\ \bbeta _{\bot }^{'}\bdelta \end{array} \right] t + \left[ \begin{array}{c} 0 \\ \bbeta _{\bot }^{'}\Psi (1) \end{array} \right] \sum _{i=0}^ t\bepsilon _ i \\ &  + &  \left[ \begin{array}{c} \bbeta '\Psi ^{*}(B) \\ \bbeta _{\bot }^{'}\Psi ^{*}(B) \end{array} \right] \bepsilon _ t \end{eqnarray*}

The Stock-Watson common trends test is performed based on the component $\mb{w} _ t$ by testing whether $\bbeta _{\bot }^{'}\Psi (1)$ has rank m against rank s.
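To make the roles of $\bbeta $ and $\bbeta _{\bot }$ concrete, the following sketch works with a two-dimensional system whose cointegrating vector is $(1, -\gamma )'$. It assumes $\gamma = 2$ and a long-run matrix $\Psi (1)$ with rows $(0, \gamma )$ and $(0, 1)$; both are hypothetical inputs for the illustration. The sketch checks only the algebraic properties of the two projections and does not compute the Stock-Watson statistic.

```python
import math

gamma = 2.0                               # hypothetical value of gamma
beta = [1.0, -gamma]                      # cointegrating vector (1, -gamma)'
norm = math.sqrt(1.0 + gamma ** 2)
beta_perp = [gamma / norm, 1.0 / norm]    # unit vector orthogonal to beta

# Assumed long-run matrix Psi(1) for the illustration (rank 1).
psi_one = [[0.0, gamma], [0.0, 1.0]]

def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))

def row_times_matrix(row, mat):
    """Product of a row vector and a matrix stored as a list of rows."""
    return [sum(row[i] * mat[i][j] for i in range(len(row)))
            for j in range(len(mat[0]))]

orth = dot(beta_perp, beta)                       # beta_perp' beta, should be 0
unit = dot(beta_perp, beta_perp)                  # beta_perp' beta_perp, should be 1
z_loading = row_times_matrix(beta, psi_one)       # beta' Psi(1): no stochastic trend in z_t
w_loading = row_times_matrix(beta_perp, psi_one)  # beta_perp' Psi(1): trend carried by w_t
```

In this illustration `z_loading` is the zero vector, so $\mb{z} _ t$ carries no stochastic trend, while `w_loading` is nonzero and has rank $m=1$, which is the quantity the test examines.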

The following statements perform the Stock-Watson test for common trends:

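proc iml;
   /* Innovation covariance matrix: 100 times the 2 x 2 identity */
   sig = 100*i(2);
   /* Stacked AR coefficient matrices (4 x 2) of a bivariate VAR(2) */
   phi = {-0.2 0.1, 0.5 0.2, 0.8 0.7, -0.4 0.6};
   /* Simulate 100 observations into the matrix y */
   call varmasim(y,phi) sigma=sig n=100 initial=0
                        seed=45876;
   cn = {'y1' 'y2'};
   create simul2 from y[colname=cn];
   append from y;
quit;

/* Add a yearly date variable that starts in 1900 */
data simul2;
   set simul2;
   date = intnx( 'year', '01jan1900'd, _n_-1 );
   format date year4. ;
run;

/* Fit a VAR(2) model and request the Stock-Watson common trends test */
proc varmax data=simul2;
   model y1 y2 / p=2 cointtest=(sw);
run;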

In Figure 35.51, the first column is the null hypothesis that $\mb{y} _ t$ has $m\leq k$ common trends; the second column is the alternative hypothesis that $\mb{y} _ t$ has $s < m$ common trends; the third column contains the eigenvalues used for the test statistics; the fourth column contains the test statistics that use AR(p) filtering of the data; the fifth column contains the 5% critical values; and the last column shows the lag p. The table shows the output for the case $p=2$.

Figure 35.51: Common Trends Test (COINTTEST=(SW) Option)

Common Trend Test

H0: Rank=m   H1: Rank=s   Eigenvalue   Filter   5% Critical Value   Lag
         1            0     1.000906     0.09              -14.10     2
         2            0     0.996763    -0.32               -8.80
                      1     0.648908   -35.11              -23.00


The test statistic for testing 2 versus 1 common trends (-35.11) is more negative than the 5% critical value (-23.00). Therefore, the test rejects the null hypothesis of two common trends in favor of the alternative, which means that the series has a single common trend.
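The decision rule used in this paragraph can be written as a one-line comparison. The sketch below applies it to the filter statistics and 5% critical values reported in Figure 35.51; the function name `rejects_null` is hypothetical, and the rule is only the one-sided comparison described in the text, not a reimplementation of the test.

```python
def rejects_null(statistic, critical_value):
    """One-sided rule described in the text: reject the null hypothesis
    when the statistic is more negative than the critical value."""
    return statistic < critical_value

# (H0 rank m, H1 rank s, filter statistic, 5% critical value) from Figure 35.51
rows = [
    (1, 0, 0.09, -14.10),
    (2, 0, -0.32, -8.80),
    (2, 1, -35.11, -23.00),
]

# Map each (m, s) pair to True when H0: m common trends is rejected.
decisions = {(m, s): rejects_null(stat, crit) for m, s, stat, crit in rows}
```

Only the comparison of 2 versus 1 common trends leads to a rejection, which matches the conclusion stated above.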