The VARMAX Procedure

Vector Error Correction Model


A vector error correction model (VECM) can lead to a better understanding of the nature of any nonstationarity among the different component series and can also improve longer-term forecasting compared to an unconstrained model.

The VECM(p) form with cointegration rank $r$ ($r \leq k$) is written as

\begin{eqnarray*} \Delta \mb{y} _{t} = \bdelta + \Pi \mb{y} _{t-1} + \sum _{i=1}^{p-1}\Phi ^*_ i \Delta \mb{y} _{t-i} + \bepsilon _ t \end{eqnarray*}

where $\Delta $ is the differencing operator, such that $\Delta \mb{y} _{t} = \mb{y} _{t}-\mb{y} _{t-1}$; $\Pi = \alpha \beta '$, where $\alpha $ and $\beta $ are $k\times r$ matrices; and $\Phi ^*_ i$ is a $k\times k$ matrix.

The VECM(p) form has an equivalent VAR(p) representation, as described in the section Vector Autoregressive Model:

\begin{eqnarray*} \mb{y} _{t} = \bdelta + (I_ k+\Pi +\Phi ^*_1) \mb{y} _{t-1} +\sum _{i=2}^{p-1} (\Phi ^*_ i-\Phi ^*_{i-1}) \mb{y} _{t-i} -\Phi ^*_{p-1}\mb{y} _{t-p} + \bepsilon _ t \end{eqnarray*}

where $I_ k$ is a $k\times k$ identity matrix.
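
As a check on the indexing, consider the special case $p=2$, which is the case used in the example that follows. The sum over $i=2,\ldots ,p-1$ is then empty, and substituting $\Delta \mb{y} _{t} = \mb{y} _{t}-\mb{y} _{t-1}$ and $\Delta \mb{y} _{t-1} = \mb{y} _{t-1}-\mb{y} _{t-2}$ into the VECM(2) form gives

\begin{eqnarray*} \mb{y} _{t} = \bdelta + (I_ k+\Pi +\Phi ^*_1) \mb{y} _{t-1} - \Phi ^*_1 \mb{y} _{t-2} + \bepsilon _ t \end{eqnarray*}

Writing $\Phi _1$ and $\Phi _2$ for the lag 1 and lag 2 coefficient matrices of the VAR(2) model (labels that are introduced here only for convenience), this means $\Phi _1 = I_ k+\Pi +\Phi ^*_1$ and $\Phi _2 = -\Phi ^*_1$. Conversely, $\Pi = \Phi _1+\Phi _2-I_ k$ and $\Phi ^*_1 = -\Phi _2$.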

An example of a second-order nonstationary vector autoregressive model is

\begin{eqnarray*} \mb{y} _ t = \left( \begin{array}{rr} -0.2 & 0.1 \\ 0.5 & 0.2 \\ \end{array} \right) \mb{y} _{t-1} + \left( \begin{array}{rr} 0.8 & 0.7 \\ -0.4 & 0.6 \\ \end{array} \right) \mb{y} _{t-2} + \bepsilon _ t \end{eqnarray*}

with

\begin{eqnarray*} \Sigma = \left( \begin{array}{rr} 100 & 0 \\ 0 & 100 \\ \end{array} \right) ~ ~ \mr{and} ~ ~ \mb{y} _{-1} = \mb{y} _0 = \left( \begin{array}{r} 0 \\ 0 \\ \end{array} \right) \end{eqnarray*}

This process can be given the following VECM(2) representation with cointegration rank one:

\begin{eqnarray*} \Delta \mb{y} _ t = \left( \begin{array}{r} -0.4 \\ 0.1 \\ \end{array} \right) ( 1, -2 ) \mb{y} _{t-1} - \left( \begin{array}{rr} 0.8 & 0.7 \\ -0.4 & 0.6 \\ \end{array} \right) \Delta \mb{y} _{t-1} + \bepsilon _ t \end{eqnarray*}
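
The correspondence between the VAR(2) and VECM(2) forms of this example can be checked numerically. The following sketch is written in Python rather than SAS, purely as an illustration; the helper names (`mat_add`, `mat_scale`, `outer`, `close`) are hypothetical and are not part of any SAS product. It confirms that the sum of the two VAR coefficient matrices minus the identity equals the product of $(-0.4, 0.1)'$ and $(1, -2)$, that this matrix has a zero determinant (so its rank is less than 2), and that the coefficient of $\Delta \mb{y} _{t-1}$ is the negative of the lag 2 matrix.

```python
# Check the VAR(2) -> VECM(2) mapping for the example, using plain lists.
# All helper names are illustrative and not part of SAS.

def mat_add(a, b):
    """Elementwise sum of two matrices stored as lists of rows."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def mat_scale(c, a):
    """Multiply every element of matrix a by the scalar c."""
    return [[c * x for x in row] for row in a]

def outer(u, v):
    """Outer product u * v' of two vectors."""
    return [[ui * vj for vj in v] for ui in u]

def close(a, b, tol=1e-12):
    """True if two matrices agree elementwise within tol."""
    return all(abs(x - y) <= tol
               for ra, rb in zip(a, b) for x, y in zip(ra, rb))

phi1 = [[-0.2, 0.1], [0.5, 0.2]]      # lag 1 VAR coefficients
phi2 = [[0.8, 0.7], [-0.4, 0.6]]      # lag 2 VAR coefficients
identity = [[1.0, 0.0], [0.0, 1.0]]

# Pi = Phi1 + Phi2 - I and Phi*_1 = -Phi2 for a VAR(2)
pi = mat_add(mat_add(phi1, phi2), mat_scale(-1.0, identity))
phi_star1 = mat_scale(-1.0, phi2)

alpha = [-0.4, 0.1]
beta = [1.0, -2.0]
alpha_beta_t = outer(alpha, beta)

det_pi = pi[0][0] * pi[1][1] - pi[0][1] * pi[1][0]

print(close(pi, alpha_beta_t))   # Pi agrees with alpha * beta'
print(abs(det_pi) < 1e-12)       # Pi is singular (reduced rank)
```

Comparisons use a small tolerance because sums such as -0.2 + 0.8 - 1 are not exact in floating-point arithmetic.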

The following PROC IML statements generate simulated data for this VECM(2) form, the DATA step adds a yearly date variable, and the PROC SGPLOT statements plot the data, as shown in Figure 42.12:

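/*--- Simulate 100 observations from the VAR(2) process ---*/

proc iml;
   sig = 100*i(2);                    /* innovation covariance matrix  */
   /* lag 1 and lag 2 AR coefficient matrices, stacked vertically */
   phi = {-0.2 0.1, 0.5 0.2, 0.8 0.7, -0.4 0.6};
   call varmasim(y,phi) sigma=sig n=100 initial=0
                        seed=45876;
   cn = {'y1' 'y2'};
   create simul2 from y[colname=cn];
   append from y;
quit;

/*--- Add a yearly date variable that starts in 1900 ---*/

data simul2;
   set simul2;
   date = intnx( 'year', '01jan1900'd, _n_-1 );
   format date year4. ;
run;

/*--- Plot both series against the date ---*/

proc sgplot data=simul2;
   series x=date y=y1 / lineattrs=(pattern=solid);
   series x=date y=y2 / lineattrs=(pattern=dash);
   yaxis label="Series";
run;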

Figure 42.12: Plot of Generated Data Process



Cointegration Testing

The following statements use the Johansen cointegration rank test. The COINTTEST=(JOHANSEN) option performs the Johansen trace test and is equivalent to specifying the COINTTEST option with no additional suboptions or specifying the COINTTEST=(JOHANSEN=(TYPE=TRACE)) option.

/*--- Cointegration Test ---*/

proc varmax data=simul2;
   model y1 y2 / p=2 noint dftest cointtest=(johansen);
run;

Figure 42.13 shows the output for Dickey-Fuller tests for the nonstationarity of each series and the Johansen cointegration rank test between series.

Figure 42.13: Dickey-Fuller Tests and Cointegration Rank Test

The VARMAX Procedure

Unit Root Test

Variable  Type            Rho  Pr < Rho    Tau  Pr < Tau
y1        Zero Mean      1.47    0.9628   1.65    0.9755
          Single Mean   -0.80    0.9016  -0.47    0.8916
          Trend        -10.88    0.3573  -2.20    0.4815
y2        Zero Mean     -0.05    0.6692  -0.03    0.6707
          Single Mean   -6.03    0.3358  -1.72    0.4204
          Trend        -50.49    0.0003  -4.92    0.0006

Cointegration Rank Test Using Trace

H0: Rank=r  H1: Rank>r  Eigenvalue    Trace  Pr > Trace  Drift in ECM  Drift in Process
         0           0      0.5086  70.7279      <.0001  NOINT         Constant
         1           1      0.0111   1.0921      0.3441



In the Dickey-Fuller tests, the second column specifies one of three types of models: zero mean, single mean, or trend. The third column (Rho) and the fifth column (Tau) are the test statistics that are used to test the null hypothesis that the series has a unit root. The fourth and sixth columns are the corresponding p-values. For both series, the zero mean and single mean tests do not reject the unit root null hypothesis (all of these p-values exceed 0.33), which is consistent with both series having unit roots; the only rejection occurs for y2 under the trend model. For a description of Dickey-Fuller tests, see the section PROBDF Function for Dickey-Fuller Tests in Chapter 6: SAS Macros and Functions.

In the "Cointegration Rank Test Using Trace" table, the last two columns explain the drift in the model or process. Because the NOINT option is specified, the model is

\begin{eqnarray*} \Delta \mb{y} _{t} = \Pi \mb{y} _{t-1} + \Phi ^*_1 \Delta \mb{y} _{t-1} + \bepsilon _ t \end{eqnarray*}

The column Drift in ECM indicates that there is no separate drift in the error correction model, and the column Drift in Process indicates that the process has a constant drift before differencing.

H0 is the null hypothesis, and H1 is the alternative hypothesis. The first row tests the cointegration rank $r=0$ against $r>0$, and the second row tests $r=1$ against $r>1$. The trace test statistics in the fourth column are computed by $-T\sum _{i=r+1}^ k \log (1-\lambda _ i)$, where T is the available number of observations and $\lambda _ i$ is the eigenvalue in the third column. The p-values for these statistics are output in the fifth column. If you compare the p-value in each row to the significance level of interest (such as 5%), the null hypothesis that there is no cointegrated process (H0: $r=0$) is rejected, whereas the null hypothesis that there is at most one cointegrated process (H0: $r=1$) cannot be rejected.
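
As an arithmetic check on the trace formula, the following Python sketch (an illustration, not SAS code) recomputes both statistics from the rounded eigenvalues in Figure 42.13. It assumes that $T=98$, that is, the 100 simulated observations less the $p=2$ presample values; this value is an assumption and is not stated in the output. The function name `trace_statistic` is hypothetical.

```python
import math

def trace_statistic(eigenvalues, r, T):
    """Trace statistic -T * sum_{i=r+1}^{k} log(1 - lambda_i).

    eigenvalues must be sorted in decreasing order; r is the rank under H0.
    """
    return -T * sum(math.log(1.0 - lam) for lam in eigenvalues[r:])

eigenvalues = [0.5086, 0.0111]   # rounded values from Figure 42.13
T = 98                           # assumed: 100 observations minus p = 2

trace_r0 = trace_statistic(eigenvalues, 0, T)   # H0: r = 0
trace_r1 = trace_statistic(eigenvalues, 1, T)   # H0: r = 1
print(round(trace_r0, 2), round(trace_r1, 2))
```

Because the eigenvalues are rounded to four decimal places, the results agree with the reported values of 70.7279 and 1.0921 only to about two decimal places.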

The following statements fit a VECM(2) form to the simulated data:

/*--- Vector Error Correction Model ---*/

proc varmax data=simul2;
   model y1 y2 / p=2 noint lagmax=3
                 print=(iarr estimates);
   cointeg rank=1 normalize=y1;
run;

The results in Figure 42.13 indicate that the time series are cointegrated with rank 1, so the RANK=1 option is specified in the COINTEG statement. The NORMALIZE= option in the COINTEG statement names the variable whose coefficient in the cointegrating vector is normalized to 1 (here, y1). The COINTEG statement produces the estimates of the long-run parameter, $\bbeta $, and the adjustment coefficient, $\balpha $. The PRINT=(IARR) option provides the VAR(2) representation.

The VARMAX procedure output is shown in Figure 42.14 through Figure 42.17. In Figure 42.14, "1" indicates the first column of the $\balpha $ and $\bbeta $ matrices. Because the cointegration rank is 1 in the bivariate system, $\balpha $ and $\bbeta $ are two-dimensional vectors. The estimated cointegrating vector is $\hat{\bbeta }=(1, -1.96)'$. Therefore, the long-run relationship between $y_{1t}$ and $y_{2t}$ is $y_{1t}=1.96y_{2t}$. The first element of $\hat{\bbeta }$ is 1 because $y_1$ is specified as the normalized variable.

Asymptotically, $\balpha $ follows a normal distribution, and the standard errors, t values, and p-values of its elements are shown in the "Alpha and Beta Parameter Estimates" table; however, because $\bbeta $ follows a nonnormal distribution, the corresponding standard errors, t values, and p-values are missing. The Variable column shows the variables that correspond to the coefficients. For example, for the coefficient $\balpha _{ij}$ (the ith element in the jth column of $\balpha $), which is labeled ALPHAi_j, the variable is the inner product of the transpose of the jth column of $\bbeta $ (Beta[,j]') and the vector of lag 1 dependent variables $\mb{y} _{t-1}$ (_DEP_(t-1)).

Figure 42.14: Parameter Estimates for the VECM(2) Form

The VARMAX Procedure

Type of Model        VECM(2)
Estimation Method    Maximum Likelihood Estimation
Cointegrated Rank    1

Beta

Variable         1
y1         1.00000
y2        -1.95575

Alpha

Variable         1
y1        -0.46680
y2         0.10667

Alpha and Beta Parameter Estimates

Equation  Parameter  Estimate  Standard Error  t Value  Pr > |t|  Variable
D_y1      ALPHA1_1   -0.46680         0.04786    -9.75    <.0001  Beta[,1]'*_DEP_(t-1)
          BETA1_1     1.00000                                     y1(t-1)
D_y2      ALPHA2_1    0.10667         0.05146     2.07    0.0409  Beta[,1]'*_DEP_(t-1)
          BETA2_1    -1.95575                                     y2(t-1)



Figure 42.15 shows the estimated coefficients of the lag 1 levels, $\mb{y} _{t-1}$, and of the lag 1 first differences, $\Delta \mb{y} _{t-1}$, together with their significance. "Alpha * Beta'" indicates the coefficients of $\mb{y} _{t-1}$ and is obtained by multiplying the Alpha and Beta estimates in Figure 42.14. The parameter AR1_i_j (which is shown in the "Model Parameter Estimates" table) corresponds to the (i,j) element of the "Alpha * Beta'" matrix. The parameter AR2_i_j corresponds to the (i,j) element of the differenced lagged AR coefficient matrix. The "D_" prefix on a variable name in Figure 42.15 indicates differencing.
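
The multiplication that produces the "Alpha * Beta'" matrix can be reproduced by hand. The following Python sketch (an illustration, not SAS code; the variable names are hypothetical) forms the outer product of the Alpha and Beta columns that are reported in Figure 42.14.

```python
# Outer product alpha * beta' from the rounded estimates in Figure 42.14.
alpha = [-0.46680, 0.10667]   # adjustment coefficients for y1 and y2
beta = [1.00000, -1.95575]    # cointegrating vector, normalized on y1

# Element (i, j) is alpha[i] * beta[j]
alpha_beta_t = [[a * b for b in beta] for a in alpha]

for name, row in zip(("y1", "y2"), alpha_beta_t):
    print(name, [f"{value:.5f}" for value in row])
```

Because the inputs are already rounded to five decimal places, the products can differ from the reported matrix in the last printed digit.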

Figure 42.15: Parameter Estimates for the VECM(2) Form, Continued

Parameter Alpha * Beta' Estimates

Variable        y1        y2
y1        -0.46680   0.91295
y2         0.10667  -0.20862

AR Coefficients of Differenced Lag

DIF Lag  Variable        y1        y2
1        y1        -0.74332  -0.74621
         y2         0.40493  -0.57157

Model Parameter Estimates

Equation  Parameter  Estimate  Standard Error  t Value  Pr > |t|  Variable
D_y1      AR1_1_1    -0.46680         0.04786    -9.75    <.0001  y1(t-1)
          AR1_1_2     0.91295         0.09359     9.75    <.0001  y2(t-1)
          AR2_1_1    -0.74332         0.04526   -16.42    <.0001  D_y1(t-1)
          AR2_1_2    -0.74621         0.04769   -15.65    <.0001  D_y2(t-1)
D_y2      AR1_2_1     0.10667         0.05146     2.07    0.0409  y1(t-1)
          AR1_2_2    -0.20862         0.10064    -2.07    0.0409  y2(t-1)
          AR2_2_1     0.40493         0.04867     8.32    <.0001  D_y1(t-1)
          AR2_2_2    -0.57157         0.05128   -11.15    <.0001  D_y2(t-1)



Figure 42.16 shows the parameter estimates of the innovations covariance matrix and their significance.

Figure 42.16: Parameter Estimates for the VECM(2) Form, Continued

Covariance Parameter Estimates

Parameter   Estimate  Standard Error  t Value  Pr > |t|
COV1_1      94.75575        13.53654     7.00    <.0001
COV1_2       4.52684        10.30302     0.44    0.6614
COV2_2     109.57038        15.65291     7.00    <.0001



The fitted model, with standard errors shown in parentheses below each estimate, is

\begin{eqnarray*} {\Delta \mb{y} }_ t = \left( \begin{array}{rr} -0.467 & 0.913 \\ (0.048) & (0.094)\\ 0.107 & -0.209 \\ (0.051) & (0.100)\\ \end{array} \right) \mb{y} _{t-1} + \left( \begin{array}{rr} -0.743 & -0.746 \\ (0.045)& (0.048) \\ 0.405 & -0.572 \\ (0.049) & (0.051)\\ \end{array} \right) \Delta \mb{y} _{t-1} + \bepsilon _ t \end{eqnarray*}

Figure 42.17: Change the VECM(2) Form to the VAR(2) Model

Infinite Order AR Representation

Lag  Variable        y1        y2
1    y1        -0.21013   0.16674
     y2         0.51160   0.21980
2    y1         0.74332   0.74621
     y2        -0.40493   0.57157
3    y1         0.00000   0.00000
     y2         0.00000   0.00000



The PRINT=(IARR) option in the previous SAS statements prints the reparameterized coefficient estimates. Because LAGMAX=3 is specified in those statements, coefficient matrices are printed through lag 3; the lag 3 matrix is zero because the fitted model has a VAR(2) representation.

Using the estimates in Figure 42.17, the fitted VECM(2) form can be rewritten as the following second-order vector autoregressive model:

\begin{eqnarray*} \mb{y} _ t = \left( \begin{array}{rr} -0.210 & 0.167 \\ 0.512 & 0.220 \end{array} \right) \mb{y} _{t-1} + \left( \begin{array}{rr} 0.743 & 0.746 \\ -0.405 & 0.572 \end{array} \right) \mb{y} _{t-2} + \bepsilon _ t \end{eqnarray*}
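
For $p=2$, the VAR(p) representation given earlier reduces to a lag 1 matrix of $I_ k+\Pi +\Phi ^*_1$ and a lag 2 matrix of $-\Phi ^*_1$. As a hedged arithmetic check, the following Python sketch (an illustration, not SAS code; the variable names are hypothetical) applies these two identities to the rounded estimates in Figure 42.15.

```python
# Convert the fitted VECM(2) to its VAR(2) form by using
#   lag 1 matrix = I + alpha*beta' + Phi*_1   and   lag 2 matrix = -Phi*_1.
alpha_beta_t = [[-0.46680, 0.91295], [0.10667, -0.20862]]   # Figure 42.15
phi_star1 = [[-0.74332, -0.74621], [0.40493, -0.57157]]     # Figure 42.15

k = 2
var_lag1 = [[(1.0 if i == j else 0.0) + alpha_beta_t[i][j] + phi_star1[i][j]
             for j in range(k)] for i in range(k)]
var_lag2 = [[-phi_star1[i][j] for j in range(k)] for i in range(k)]

print([[round(x, 5) for x in row] for row in var_lag1])
print([[round(x, 5) for x in row] for row in var_lag2])
```

The results reproduce the lag 1 and lag 2 matrices in Figure 42.17 to within rounding of the fifth decimal place, since the inputs are themselves rounded.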