The AUTOREG Procedure

Predicted Values

The AUTOREG procedure can produce two kinds of predicted values for the response series and corresponding residuals and confidence limits. The residuals in both cases are computed as the actual value minus the predicted value. In addition, when GARCH models are estimated, the AUTOREG procedure can output predictions of the conditional error variance.

Predicting the Unconditional Mean

The first type of predicted value is obtained from only the structural part of the model, ${\mb {x} _{t}’\mb {b} }$. These predicted values are useful for predicting values of new response time series, which are assumed to be described by the same model as the current response time series. The predicted values, residuals, and upper and lower confidence limits for the structural predictions are requested by specifying the PREDICTEDM=, RESIDUALM=, UCLM=, or LCLM= option in the OUTPUT statement. The ALPHACLM= option controls the confidence level for UCLM= and LCLM=. These confidence limits are for estimation of the mean of the dependent variable, ${\mb {x} _{t}’\mb {b} }$, where ${\mb {x} _{t}}$ is the column vector of independent variables at observation $t$.

The predicted values are computed as

\[  \hat{y}_{t}=\mb {x} _{t}’\mb {b}  \]

and the upper and lower confidence limits as

\[  \hat{u}_{t}= \hat{y}_{t}+ t_{{\alpha }/2 }\mr {v}  \]
\[  \hat{l}_{t}= \hat{y}_{t}- t_{{\alpha }/2 }\mr {v}  \]

where $\mr {v}^{2}$ is an estimate of the variance of ${\hat{y}_{t}}$ and ${t_{{\alpha }/2}}$ is the upper ${\alpha }/2$ percentage point of the t distribution, defined by

\[  \mr {Prob}(T > t_{{\alpha }/2})={\alpha }/2  \]

where T is an observation from a t distribution with q degrees of freedom. The value of ${\alpha }$ can be set with the ALPHACLM= option. The degrees of freedom parameter, q, is taken to be the number of observations minus the number of free parameters in the final model. For the YW estimation method, the value of v is calculated as

\[  \mr {v}=\sqrt {s^{2}\mb {x} _{t}’ (\mb {X} ’\mb {V} ^{-1}\mb {X} )^{-1}\mb {x} _{t} }  \]

where ${s^{2}}$ is the error sum of squares divided by q. For the ULS and ML methods, it is calculated as

\[  \mr {v}=\sqrt { s^{2}\mb {x} _{t}’\mb {Wx} _{t} }  \]

where $\mb {W}$ is the ${k{\times }k }$ submatrix of ${(\mb {J} ’\mb {J} )^{-1 }}$ that corresponds to the regression parameters. For details, see the section Computational Methods earlier in this chapter.
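
The limits above reduce to a quadratic form in $\mb {x} _{t}$ once the scaled covariance matrix of the regression estimates is available. The following Python fragment is a minimal sketch of that arithmetic, not the procedure's implementation. The names `mean_confidence_limits`, `cov_b`, and `t_crit` are hypothetical: `cov_b` is assumed to already hold $s^{2}(\mb {X} ’\mb {V} ^{-1}\mb {X} )^{-1}$ (YW) or $s^{2}\mb {W}$ (ULS and ML), and `t_crit` is assumed to be the value $t_{{\alpha }/2}$ obtained elsewhere, because the Python standard library does not provide t quantiles.

```python
import math


def mean_confidence_limits(x_t, b, cov_b, t_crit):
    """Return (lower, predicted, upper) for the structural prediction x_t'b.

    x_t    : regressor values at observation t
    b      : estimated regression coefficients
    cov_b  : assumed to be s^2 (X'V^-1 X)^-1 or s^2 W, as a nested list
    t_crit : assumed upper alpha/2 point of the t distribution with q df
    """
    k = len(x_t)
    y_hat = sum(x_t[i] * b[i] for i in range(k))
    # v^2 = x_t' cov_b x_t
    quad = sum(x_t[i] * cov_b[i][j] * x_t[j] for i in range(k) for j in range(k))
    v = math.sqrt(quad)
    return y_hat - t_crit * v, y_hat, y_hat + t_crit * v


if __name__ == "__main__":
    # Illustrative numbers only; they do not come from any fitted model.
    print(mean_confidence_limits([1.0, 2.0], [0.5, 1.5],
                                 [[0.04, 0.0], [0.0, 0.01]], 2.0))
```

Passing the covariance matrix in, rather than rebuilding it, keeps the sketch independent of the estimation method.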

Predicting Future Series Realizations

The other predicted values use both the structural part of the model and the predicted values of the error process. These conditional mean values are useful in predicting future values of the current response time series. The predicted values, residuals, and upper and lower confidence limits for future observations conditional on past values are requested by the PREDICTED=, RESIDUAL=, UCL=, or LCL= option in the OUTPUT statement. The ALPHACLI= option controls the confidence level for UCL= and LCL=. These confidence limits are for the predicted value,

\[  \tilde{y}_{t}=\mb {x} _{t}’\mb {b} + {\nu }_{t|t-1}  \]

where ${\mb {x} _{t}}$ is the vector of independent variables if all independent variables at time $t$ are nonmissing, and ${ {\nu }_{t|t-1}}$ is the minimum variance linear predictor of the error term, which is defined recursively as follows, given the AR(m) model for ${{\nu }_{t}}$:

\[  {\nu }_{s|t}=\left\{  \begin{array}{ l l } - \sum _{i=1}^{m}\hat{{\varphi }}_{i} {\nu }_{s-i|t} &  s>t\  \text {or observation}\  s\  \text {is missing} \\ {y}_{s}-\mb {x} _{s}’\mb {b} &  0<s\leq t\  \text {and observation}\  s\  \text {is nonmissing} \\ 0 &  s\leq 0 \end{array} \right.  \]

where ${ \hat{{\varphi }}_{i}, i= 1, {\ldots } , m}$, are the estimated AR parameters. Observation $s$ is considered to be missing if the dependent variable or at least one independent variable is missing. If some of the independent variables at time $t$ are missing, the predicted $\tilde{y}_{t}$ is also missing. With the same definition of ${\nu }_{s|t}$, the prediction method can be easily extended to the multistep forecast of $\tilde{y}_{t+d}, d>0$:

\[  \tilde{y}_{t+d}=\mb {x} _{t+d}’\mb {b} + {\nu }_{t+d|t-1}  \]

The prediction method is implemented through the Kalman filter.
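
The recursive definition of ${\nu }_{s|t}$ can also be evaluated directly. The following Python sketch does so with a plain loop; it is offered only as an illustration of the recursion and is not the Kalman filter implementation that the procedure uses. The function name `predicted_noise` and its argument layout are assumptions: `resid[k-1]` is taken to hold $y_{k}-\mb {x} _{k}’\mb {b}$ for observation $k$, with `None` marking a missing observation, and `phi` holds the estimated AR parameters in the sign convention of the formula above.

```python
def predicted_noise(resid, phi, s, t):
    """Return nu_{s|t} for the AR(m) error model.

    resid : structural residuals y_k - x_k'b for k = 1..n (None if missing)
    phi   : estimated AR parameters [phi_1, ..., phi_m]
    s, t  : target time and conditioning time (1-based)
    """
    m = len(phi)
    nu = {}  # nu[k] holds nu_{k|t}; times k <= 0 are treated as 0
    for k in range(1, s + 1):
        observed = k <= t and k <= len(resid) and resid[k - 1] is not None
        if observed:
            nu[k] = resid[k - 1]
        else:
            # s > t or observation missing: propagate through the AR recursion
            nu[k] = -sum(phi[i - 1] * nu.get(k - i, 0.0) for i in range(1, m + 1))
    return nu.get(s, 0.0)


if __name__ == "__main__":
    # Illustrative residuals with observation 2 missing.
    residuals = [2.0, None, 1.0]
    print(predicted_noise(residuals, [-0.5], s=3, t=2))  # one-step predictor
    print(predicted_noise(residuals, [-0.5], s=5, t=3))  # multistep predictor
```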

If $\tilde{y}_{t}$ is not missing, the upper and lower confidence limits are computed as

\[  \tilde{u}_{t}= \tilde{y}_{t}+ t_{{\alpha }/2}\mr {v}  \]
\[  \tilde{l}_{t}= \tilde{y}_{t}- t_{{\alpha }/2}\mr {v}  \]

where v, in this case, is computed as

\[  \mr {v}=\sqrt {\mb {z}_{t}’ \mb {V}_{\beta } \mb {z}_{t}+s^{2}r}  \]

where $\mb {V}_{\beta }$ is the variance-covariance matrix of the estimate of the regression parameter vector $\beta $, and $\mb {z}_{t}$ is defined as

\[  \mb {z}_{t} = \mb {x}_{t} + \sum _{i=1}^{m} \hat{{\varphi }}_{i} \mb {x}_{t-i|t-1}  \]

and $\mb {x}_{s|t}$ is defined in a similar way as ${\nu }_{s|t}$:

\[  {\mb {x}}_{s|t}=\left\{  \begin{array}{ l l } - \sum _{i=1}^{m}\hat{{\varphi }}_{i} {\mb {x}}_{s-i|t} &  s>t\  \text {or observation}\  s\  \text {is missing} \\ \mb {x} _{s} &  0<s\leq t\  \text {and observation}\  s\  \text {is nonmissing} \\ 0 &  s\leq 0 \end{array} \right.  \]

The value ${s^{2}r}$ is the estimate of the conditional prediction error variance. At the start of the series, and after missing values, r is generally greater than 1. See the section Predicting the Conditional Variance for the computational details of r. The plot of residuals and confidence limits in Example 8.4 illustrates this behavior.
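
As a rough illustration of how $\mb {z}_{t}$ and v fit together, the following Python sketch builds $\mb {x}_{s|t-1}$ by the recursion above and then evaluates $\sqrt {\mb {z}_{t}’ \mb {V}_{\beta } \mb {z}_{t}+s^{2}r}$. The function name `conditional_std_error` and its arguments are hypothetical. The values of `cov_beta`, `s2`, and `r` are assumed to be supplied by the caller, and `x_hist` is assumed to list the regressor vectors for observations 1 through $t-1$, with `None` for a missing observation.

```python
import math


def conditional_std_error(x_hist, x_t, phi, cov_beta, s2, r):
    """Return v = sqrt(z_t' V_beta z_t + s^2 r) for the prediction at time t."""
    k, m = len(x_t), len(phi)
    filled = []  # filled[s-1] holds x_{s|t-1} for s = 1..t-1
    for s in range(1, len(x_hist) + 1):
        if x_hist[s - 1] is not None:
            filled.append(list(x_hist[s - 1]))
        else:
            # Missing observation: apply the same recursion as for nu_{s|t}
            vec = [0.0] * k
            for i in range(1, m + 1):
                if s - i >= 1:
                    for c in range(k):
                        vec[c] -= phi[i - 1] * filled[s - i - 1][c]
            filled.append(vec)
    t = len(x_hist) + 1
    z = list(x_t)
    for i in range(1, m + 1):
        if t - i >= 1:  # x_{s|t-1} is 0 for s <= 0
            for c in range(k):
                z[c] += phi[i - 1] * filled[t - i - 1][c]
    quad = sum(z[a] * cov_beta[a][b] * z[b] for a in range(k) for b in range(k))
    return math.sqrt(quad + s2 * r)


if __name__ == "__main__":
    # Illustrative inputs only.
    print(conditional_std_error([[1.0, 2.0]], [1.0, 3.0], [-0.5],
                                [[0.04, 0.0], [0.0, 0.01]], 1.0, 1.0))
```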

Except to adjust the degrees of freedom for the error sum of squares, the preceding formulas do not account for the fact that the autoregressive parameters are estimated. In particular, the confidence limits are likely to be somewhat too narrow. In large samples, this is probably not an important effect, but it might be appreciable in small samples. Refer to Harvey (1981) for some discussion of this problem for AR(1) models.

At the beginning of the series (the first m observations, where m is the value of the NLAG= option) and after missing values, these residuals do not match the residuals obtained by using OLS on the transformed variables. This is because, in these cases, the predicted noise values must be based on less than a complete set of past noise values and, thus, have larger variance. The GLS transformation for these observations includes a scale factor in addition to a linear combination of past values. Put another way, the $\mb {L} ^{-1}$ matrix defined in the section Computational Methods has the value 1 along the diagonal, except for the first m observations and after missing values.

Predicting the Conditional Variance

The GARCH process can be written

\[  {\epsilon }^{2}_{t} = {\omega } + \sum _{i=1}^{n}{( {\alpha }_{i}+ {\gamma }_{i}) {\epsilon }^{2}_{t-i}} - \sum _{j=1}^{p}{{\gamma }_{j} {\eta }_{t-j}} + {\eta }_{t}  \]

where ${ {\eta }_{t}= {\epsilon }^{2}_{t}- h_{t}}$ and ${n = \max (p,q)}$. This representation shows that the squared residual ${\epsilon }^{2}_{t}$ follows an ARMA$(n,p)$ process. Then for any ${d > 0}$, the conditional expectations are as follows:

\[  \mb {E} ( {\epsilon }^{2}_{t+d}| {\Psi }_{t}) = {\omega } + \sum _{i=1}^{n}{( {\alpha }_{i}+ {\gamma }_{i}) \mb {E} ( {\epsilon }^{2}_{t+d-i}| {\Psi }_{t})} - \sum _{j=1}^{p}{{\gamma }_{j}\mb {E} ( {\eta }_{t+d-j}| {\Psi }_{t})}  \]
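
The conditional expectation recursion can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the procedure's code: the function name `garch_variance_forecast` is hypothetical, `eps2` and `h` are assumed to hold the squared residuals and conditional variances up to time $t$ (at least $\max (p,q)$ values), and the sketch uses the facts that $\mb {E} ( {\eta }_{t+k}| {\Psi }_{t})=0$ for $k>0$ and that quantities dated $t$ or earlier are known.

```python
def garch_variance_forecast(eps2, h, omega, alpha, gamma, horizon):
    """Return [E(eps^2_{t+d} | Psi_t) for d = 1..horizon] for a GARCH(p, q) model.

    eps2  : squared residuals up to time t (last element is time t)
    h     : conditional variances up to time t (last element is h_t)
    alpha : ARCH coefficients alpha_1..alpha_q
    gamma : GARCH coefficients gamma_1..gamma_p
    """
    q, p = len(alpha), len(gamma)
    n = max(p, q)
    a = list(alpha) + [0.0] * (n - q)  # alpha_i = 0 for i > q
    g = list(gamma) + [0.0] * (n - p)  # gamma_i = 0 for i > p
    t = len(eps2)
    forecasts = []
    for d in range(1, horizon + 1):
        value = omega
        for i in range(1, n + 1):
            k = d - i  # offset of the lagged term relative to time t
            e2 = forecasts[k - 1] if k > 0 else eps2[t - 1 + k]
            value += (a[i - 1] + g[i - 1]) * e2
        for j in range(1, p + 1):
            k = d - j
            if k <= 0:  # eta is known only up to time t; its forecast is 0
                value -= g[j - 1] * (eps2[t - 1 + k] - h[t - 1 + k])
        forecasts.append(value)
    return forecasts


if __name__ == "__main__":
    # Illustrative GARCH(1,1) values only.
    print(garch_variance_forecast([1.0], [0.5], 0.1, [0.2], [0.7], 3))
```

For a GARCH(1,1) model the first forecast reduces to ${\omega }+{\alpha }_{1}{\epsilon }^{2}_{t}+{\gamma }_{1}h_{t}$, which is $h_{t+1}$.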

The d-step-ahead prediction error, ${\xi }_{t+d} = y_{t+d} - y_{t+d|t}$, has the conditional variance

\[  \mb {V} ( {\xi }_{t+d}| {\Psi }_{t}) = \sum _{j=0}^{d-1}{g^{2}_{j} {\sigma }^{2}_{t+d-j|t}}  \]

where

\[  {\sigma }^{2}_{t+d-j|t} = \mb {E} ( {\epsilon }^{2}_{t+d-j}| {\Psi }_{t})  \]

Coefficients in the conditional d-step prediction error variance are calculated recursively using the formula

\[  g_{j} = - {\varphi }_{1} g_{j-1} - {\ldots } - {\varphi }_{m} g_{j-m}  \]

where ${ g_{0}=1}$ and ${ g_{j}=0 }$ if ${j<0}$; ${\varphi }_{1}$, ${\ldots }$, ${\varphi }_{m}$ are autoregressive parameters. Since the parameters are not known, the conditional variance is computed using the estimated autoregressive parameters. The d-step-ahead prediction error variance is simplified when there are no autoregressive terms:

\[  \mb {V} ( {\xi }_{t+d}| {\Psi }_{t}) = {\sigma }^{2}_{t+d|t}  \]

Therefore, the one-step-ahead prediction error variance is equivalent to the conditional error variance defined in the GARCH process:

\[  h_{t} = \mb {E} ( {\epsilon }^{2}_{t}| {\Psi }_{t-1}) = {\sigma }^{2}_{t|t-1}  \]
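
The weights $g_{j}$ and the resulting d-step prediction error variance can be computed as in the following Python sketch. The function names are hypothetical, and `sigma2_forecasts` is assumed to hold ${\sigma }^{2}_{t+k|t}$ for $k = 1, {\ldots }, d$, obtained separately. An empty `phi` corresponds to a model with no autoregressive terms.

```python
def ar_prediction_weights(phi, d):
    """Return [g_0, ..., g_{d-1}] with g_0 = 1 and g_j = -sum_i phi_i g_{j-i}."""
    g = []
    for j in range(d):
        if j == 0:
            g.append(1.0)
        else:
            # Terms with j - i < 0 are skipped because g_j = 0 for j < 0
            g.append(-sum(phi[i - 1] * g[j - i]
                          for i in range(1, len(phi) + 1) if j - i >= 0))
    return g


def prediction_error_variance(phi, sigma2_forecasts):
    """Return sum_{j=0}^{d-1} g_j^2 sigma^2_{t+d-j|t}, with d = len(sigma2_forecasts)."""
    d = len(sigma2_forecasts)
    g = ar_prediction_weights(phi, d)
    return sum(g[j] ** 2 * sigma2_forecasts[d - j - 1] for j in range(d))


if __name__ == "__main__":
    # Illustrative values only.
    print(ar_prediction_weights([-0.5], 3))
    print(prediction_error_variance([-0.5], [1.0, 1.0, 1.0]))
    print(prediction_error_variance([], [0.65, 0.685, 0.7165]))
```

With constant variance forecasts, the result divided by that constant is $\sum _{j=0}^{d-1}{g^{2}_{j}}$.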

The multistep forecast of conditional error variance of the EGARCH, QGARCH, TGARCH, PGARCH, and GARCH-M models cannot be calculated using the preceding formula for the GARCH model. The following formulas are recursively implemented to obtain the multistep forecast of conditional error variance of these models:

  • for the EGARCH(p, q) model:

    \[  {\ln }({\sigma }^{2}_{t+d|t}) = \omega + \sum _{i=d}^{q}{{\alpha }_{i}g( z_{t+d-i})} + \sum _{j=1}^{d-1}{{\gamma }_{j}{\ln }( {\sigma }^{2}_{t+d-j|t} )} + \sum _{j=d}^{p}{{\gamma }_{j}{\ln }( h_{t+d-j})}  \]

    where

    \[  g( z_{t}) = {\theta } z_{t}+{|z_{t}|}-{E}{|z_{t}|}  \]
    \[  z_{t} = {\epsilon }_{t}/\sqrt { h_{t}}  \]

  • for the QGARCH(p, q) model:

    \[  \begin{array}{ r c l } {\sigma }^{2}_{t+d|t} &  = &  \omega + \sum _{i=1}^{d-1}{{\alpha }_{i} ({\sigma }^{2}_{t+d-i|t}+\psi _{i}^2)} + \sum _{i=d}^{q}{{\alpha }_{i} (\epsilon _{t+d-i}-\psi _{i})^2} \\ &  &  + \sum _{j=1}^{d-1}{{\gamma }_{j}{\sigma }^{2}_{t+d-j|t} } + \sum _{j=d}^{p}{{\gamma }_{j} h_{t+d-j}} \end{array}  \]

  • for the TGARCH(p, q) model:

    \[  \begin{array}{ r c l } {\sigma }^{2}_{t+d|t} &  = &  \omega + \sum _{i=1}^{d-1}{({\alpha }_{i}+\psi _{i}/2) {\sigma }^{2}_{t+d-i|t}} + \sum _{i=d}^{q}{({\alpha }_{i} + 1_{{\epsilon }_{t+d-i}<0} {\psi }_{i}) {\epsilon }_{t+d-i}^{2}} \\ &  &  + \sum _{j=1}^{d-1}{{\gamma }_{j}{\sigma }^{2}_{t+d-j|t} } + \sum _{j=d}^{p}{{\gamma }_{j} h_{t+d-j}} \end{array}  \]

  • for the PGARCH(p, q) model:

    \[  \begin{array}{ r c l } ({\sigma }^{2}_{t+d|t})^{\lambda } &  = &  \omega + \sum _{i=1}^{d-1}{{\alpha }_{i}((1+\psi _{i})^{2\lambda }+(1-\psi _{i})^{2\lambda })({\sigma }^{2}_{t+d-i|t})^{\lambda }/2} \\ &  &  + \sum _{i=d}^{q}{{\alpha }_{i} (|{\epsilon }_{t+d-i}|-{\psi }_{i}{\epsilon }_{t+d-i})^{2\lambda }} \\ &  &  + \sum _{j=1}^{d-1}{{\gamma }_{j}({\sigma }^{2}_{t+d-j|t})^{\lambda } } + \sum _{j=d}^{p}{{\gamma }_{j} h_{t+d-j}^{\lambda }} \end{array}  \]

  • for the GARCH-M model: ignore the mean effect and directly use the formula of the corresponding GARCH model.
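
To show how one of these recursions can be evaluated, the following Python sketch implements the TGARCH(p, q) formula from the list above. It is a sketch under stated assumptions and not the procedure's implementation. The function name `tgarch_variance_forecast` is hypothetical, and `eps` and `h` are assumed to hold the residuals and conditional variances up to time $t$, with at least $q$ and $p$ values respectively.

```python
def tgarch_variance_forecast(eps, h, omega, alpha, psi, gamma, horizon):
    """Return [sigma^2_{t+d|t} for d = 1..horizon] for a TGARCH(p, q) model.

    eps   : residuals up to time t (last element is eps_t)
    h     : conditional variances up to time t (last element is h_t)
    alpha, psi : ARCH and threshold coefficients, each of length q
    gamma : GARCH coefficients of length p
    """
    q, p = len(alpha), len(gamma)
    t = len(eps)
    sigma2 = []  # sigma2[k-1] holds sigma^2_{t+k|t}
    for d in range(1, horizon + 1):
        value = omega
        for i in range(1, q + 1):
            if i < d:
                # Future shock: the indicator has expectation 1/2
                value += (alpha[i - 1] + psi[i - 1] / 2.0) * sigma2[d - i - 1]
            else:
                e = eps[t - 1 + d - i]  # observed residual at time t+d-i
                indicator = 1.0 if e < 0 else 0.0
                value += (alpha[i - 1] + indicator * psi[i - 1]) * e * e
        for j in range(1, p + 1):
            if j < d:
                value += gamma[j - 1] * sigma2[d - j - 1]
            else:
                value += gamma[j - 1] * h[t - 1 + d - j]
        sigma2.append(value)
    return sigma2


if __name__ == "__main__":
    # Illustrative TGARCH(1,1) values only.
    print(tgarch_variance_forecast([-1.0], [0.5], 0.1, [0.1], [0.2], [0.6], 2))
```

The QGARCH and PGARCH recursions have the same two-part structure: observed residuals are used for lags that reach back to time $t$ or earlier, and forecast variances are used otherwise.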

If the conditional error variance is homoscedastic, the conditional prediction error variance is identical to the unconditional prediction error variance:

\[  \mb {V} ( {\xi }_{t+d}| {\Psi }_{t}) = \mb {V} ( {\xi }_{t+d}) = {\sigma }^{2}\sum _{j=0}^{d-1}{g^{2}_{j}}  \]

since ${ {\sigma }^{2}_{t+d-j|t} = {\sigma }^{2}}$. You can compute ${s^{2}r}$ (which is the second term of the variance for the predicted value ${\tilde{y}_{t}}$ explained in the section Predicting Future Series Realizations) by using the formula ${ {\sigma }^{2}\sum _{j=0}^{d-1}{g^{2}_{j}}}$, and r is estimated from ${\sum _{j=0}^{d-1}{g^{2}_{j}}}$ by using the estimated autoregressive parameters.

Consider the following conditional prediction error variance:

\[  \mb {V} ( {\xi }_{t+d}| {\Psi }_{t}) = {\sigma }^{2} \sum _{j=0}^{d-1}{g^{2}_{j}} + \sum _{j=0}^{d-1}{g^{2}_{j} ( {\sigma }^{2}_{t+d-j|t} - {\sigma }^{2})}  \]

The second term in the preceding equation can be interpreted as the noise from using the homoscedastic conditional variance when the errors follow the GARCH process. However, it is expected that if the GARCH process is covariance stationary, the difference between the conditional prediction error variance and the unconditional prediction error variance disappears as the forecast horizon d increases.