Predicted and Residual Values

After the model has been fit, predicted and residual values are usually calculated, graphed, and output. The predicted values are calculated from the estimated regression equation; the raw residuals are calculated as the observed values minus the predicted values. Other forms of residuals, such as studentized or cumulative residuals, are often used for model diagnostics. Some procedures can calculate standard errors of residuals, predicted mean values, and individual predicted values.

Consider the ith observation, where $\mb{x}_i'$ is the row of regressors, $\widehat{\bbeta}$ is the vector of parameter estimates, and $s^2$ is the estimate of the residual variance (the mean squared error). The leverage value of the ith observation is defined as

\[  h_i = w_i \mb{x}_i' (\bX'\bW\bX)^{-1} \mb{x}_i  \]

where $\bX$ is the design matrix for the observed data, $\mb{x}_i'$ is an arbitrary regressor vector (possibly but not necessarily a row of $\bX$), $\bW$ is a diagonal matrix with the observed weights on the diagonal, and $w_i$ is the weight corresponding to $\mb{x}_i'$.
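As a sketch of this computation (not SAS code), the leverage values can be obtained with NumPy; the design matrix, weights, and variable names below are illustrative assumptions, not values from the text.

```python
import numpy as np

# Hypothetical weighted design: intercept plus one regressor
X = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0],
              [1.0, 4.0]])
w = np.array([1.0, 2.0, 1.0, 0.5])            # observation weights
W = np.diag(w)

XtWX_inv = np.linalg.inv(X.T @ W @ X)

# h_i = w_i * x_i' (X'WX)^{-1} x_i for each observed row
h = np.array([w[i] * X[i] @ XtWX_inv @ X[i] for i in range(len(w))])

# For rows of X, the leverages sum to rank(X), here 2
print(h.sum())
```

When $\mb{x}_i'$ is a row of $\bX$, $h_i$ is a diagonal element of the weighted hat matrix, which is why the leverages of the observed rows sum to the rank of $\bX$.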

Then the predicted mean and the standard error of the predicted mean are

\begin{align*}
\widehat{y}_i &= \mb{x}_i'\widehat{\bbeta} \\
\mr{STDERR}(\widehat{y}_i) &= \sqrt{s^2 h_i / w_i}
\end{align*}

The standard error of the individual (future) predicted value $y_ i$ is

\[  \mr{STDERR}(y_i) = \sqrt{s^2 (1 + h_i) / w_i}  \]
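These two standard errors can be sketched as follows (again not SAS code); the data are illustrative and the weights are taken as 1, so the $w_i$ terms drop out.

```python
import numpy as np

# Hypothetical unweighted data (all w_i = 1), not from the text
X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0], [1.0, 5.0]])
y = np.array([1.1, 1.9, 3.2, 3.9, 5.1])
n, k = X.shape

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)   # parameter estimates
s2 = ((y - X @ beta_hat) ** 2).sum() / (n - k) # mean squared error

x0 = np.array([1.0, 3.0])                      # predictor vector, weight 1
h0 = x0 @ np.linalg.inv(X.T @ X) @ x0          # leverage of x0

y0_hat = x0 @ beta_hat                         # predicted mean
se_mean = np.sqrt(s2 * h0)                     # STDERR of the predicted mean
se_indiv = np.sqrt(s2 * (1 + h0))              # STDERR of an individual prediction
```

The individual-prediction standard error is always the larger of the two, because predicting a future observation adds the variance of a new error term to the variance of the estimated mean.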

If the predictor vector $\mb{x}_i$ corresponds to an observation in the analysis data, then the raw residual for that observation and the standard error of the raw residual are defined as

\begin{align*}
\mr{RESID}_i &= y_i - \mb{x}_i'\widehat{\bbeta} \\
\mr{STDERR}(\mr{RESID}_i) &= \sqrt{s^2 (1 - h_i) / w_i}
\end{align*}

The studentized residual is the ratio of the raw residual to its estimated standard error. Symbolically,

\[  \mr{STUDENT}_i = \frac{\mr{RESID}_i}{\mr{STDERR}(\mr{RESID}_i)}  \]
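A minimal sketch of the residual quantities, assuming unit weights and illustrative data (the names are ours, not from the text):

```python
import numpy as np

# Hypothetical unweighted data (all w_i = 1), not from the text
X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0], [1.0, 5.0]])
y = np.array([1.1, 1.9, 3.2, 3.9, 5.1])
n, k = X.shape

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
resid = y - X @ beta_hat                       # raw residuals
s2 = (resid ** 2).sum() / (n - k)              # mean squared error

h = np.diag(X @ np.linalg.inv(X.T @ X) @ X.T)  # leverage of each observation
se_resid = np.sqrt(s2 * (1 - h))               # STDERR of each raw residual
student = resid / se_resid                     # studentized residuals
```

Because the model contains an intercept, the raw residuals sum to zero; the studentized residuals do not share this property, since each is scaled by its own leverage-dependent standard error.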

There are two kinds of intervals involving predicted values that are associated with a measure of confidence: the confidence interval for the mean value of the response and the prediction (or forecasting) interval for an individual observation. As discussed in the section Mean Squared Error in Chapter 3: Introduction to Statistical Modeling with SAS/STAT Software, both intervals are based on the mean squared error of predicting a target based on the result of the model fit. The difference in the expressions for the confidence interval and the prediction interval comes about because the target of estimation is a constant in the case of the confidence interval (the mean of an observation) and the target is a random variable in the case of the prediction interval (a new observation).

For example, you can construct a confidence interval for the ith observation that contains the true mean value of the response with probability $1 - \alpha $. The upper and lower limits of the confidence interval for the mean value are

\begin{align*}
\mr{LowerM} &= \mb{x}_i'\widehat{\bbeta} - t_{\alpha/2,\nu} \sqrt{s^2 h_i/w_i} \\
\mr{UpperM} &= \mb{x}_i'\widehat{\bbeta} + t_{\alpha/2,\nu} \sqrt{s^2 h_i/w_i}
\end{align*}

where $t_{\alpha/2,\nu}$ is the tabulated $t$ quantile with degrees of freedom equal to the degrees of freedom for the mean squared error, $\nu = n - \mr{rank}(\bX)$.

The limits for the prediction interval for an individual response are

\begin{align*}
\mr{LowerI} &= \mb{x}_i'\widehat{\bbeta} - t_{\alpha/2,\nu} \sqrt{s^2(1+h_i)/w_i} \\
\mr{UpperI} &= \mb{x}_i'\widehat{\bbeta} + t_{\alpha/2,\nu} \sqrt{s^2(1+h_i)/w_i}
\end{align*}
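Both intervals can be sketched in Python with illustrative data and unit weights; `scipy.stats.t.ppf` supplies the quantile, with $t_{\alpha/2,\nu}$ taken as the upper $\alpha/2$ quantile of the $t$ distribution.

```python
import numpy as np
from scipy.stats import t

# Hypothetical unweighted data (all w_i = 1), not from the text
X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0], [1.0, 5.0]])
y = np.array([1.1, 1.9, 3.2, 3.9, 5.1])
n, k = X.shape

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
s2 = ((y - X @ beta_hat) ** 2).sum() / (n - k)
nu = n - np.linalg.matrix_rank(X)              # error degrees of freedom

x0 = np.array([1.0, 3.0])                      # predictor vector, weight 1
h0 = x0 @ np.linalg.inv(X.T @ X) @ x0
y0_hat = x0 @ beta_hat

alpha = 0.05
tq = t.ppf(1 - alpha / 2, nu)                  # upper alpha/2 t quantile

# Confidence interval for the mean response
lower_m = y0_hat - tq * np.sqrt(s2 * h0)
upper_m = y0_hat + tq * np.sqrt(s2 * h0)

# Prediction interval for an individual response
lower_i = y0_hat - tq * np.sqrt(s2 * (1 + h0))
upper_i = y0_hat + tq * np.sqrt(s2 * (1 + h0))
```

The prediction interval strictly contains the confidence interval, mirroring the $h_i$ versus $1 + h_i$ terms in the two formulas.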

Influential observations are those that, according to various criteria, appear to have a large influence on the analysis. One measure of influence, Cook’s D, measures the change to the estimates that results from deleting an observation:

\[  \mr{COOKD}_i = \frac{1}{k} \mr{STUDENT}_i^2 \left( \frac{\mr{STDERR}(\widehat{y}_i)}{\mr{STDERR}(\mr{RESID}_i)} \right)^2  \]

where k is the number of parameters in the model (including the intercept). For more information, see Cook (1977, 1979).
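Since the ratio of the two standard errors is $\sqrt{h_i/(1-h_i)}$, Cook's D reduces to a leverage-weighted squared studentized residual. A hypothetical NumPy sketch (illustrative data, unit weights), cross-checked against the delete-one-observation definition:

```python
import numpy as np

# Hypothetical unweighted data (all w_i = 1), not from the text
X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0], [1.0, 5.0]])
y = np.array([1.1, 1.9, 3.2, 3.9, 5.1])
n, k = X.shape

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
resid = y - X @ beta_hat
s2 = (resid ** 2).sum() / (n - k)
h = np.diag(X @ np.linalg.inv(X.T @ X) @ X.T)

student = resid / np.sqrt(s2 * (1 - h))
cookd = student ** 2 * h / ((1 - h) * k)       # (1/k) STUDENT^2 * h/(1-h)

# Cross-check: Cook's D for observation 0 from an actual refit without it
keep = np.arange(n) != 0
b_del = np.linalg.solve(X[keep].T @ X[keep], X[keep].T @ y[keep])
d0 = ((X @ beta_hat - X @ b_del) ** 2).sum() / (k * s2)
```

The closed-form value and the refit-based value agree exactly, which is the point of the diagnostic: no observation actually needs to be deleted to compute it.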

The predicted residual for observation i is defined as the residual for the ith observation that results from dropping the ith observation from the parameter estimates. The sum of squares of predicted residual errors is called the PRESS statistic:

\begin{align*}
\mr{PRESID}_i &= \frac{\mr{RESID}_i}{1-h_i} \\
\mr{PRESS} &= \sum_{i=1}^n w_i \mr{PRESID}_i^2
\end{align*}
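The predicted residuals can likewise be computed from a single fit via the leverage formula, as this sketch shows (illustrative data, unit weights, names are ours); the identity is verified against an explicit leave-one-out refit.

```python
import numpy as np

# Hypothetical unweighted data (all w_i = 1), not from the text
X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0], [1.0, 5.0]])
y = np.array([1.1, 1.9, 3.2, 3.9, 5.1])
n, k = X.shape

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
resid = y - X @ beta_hat
h = np.diag(X @ np.linalg.inv(X.T @ X) @ X.T)

presid = resid / (1 - h)                       # predicted (PRESS) residuals
press = (presid ** 2).sum()                    # PRESS statistic (unit weights)

# Identity check: PRESID_0 equals the residual at x_0 from a fit without obs 0
keep = np.arange(n) != 0
b_del = np.linalg.solve(X[keep].T @ X[keep], X[keep].T @ y[keep])
```

Because each raw residual is inflated by $1/(1-h_i) > 1$, PRESS is never smaller than the ordinary error sum of squares, which is why it is a more honest measure of predictive performance.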