Introduction to Regression Procedures


Predicted and Residual Values

After the model has been fit, predicted and residual values are usually calculated, graphed, and output. The predicted values are calculated from the estimated regression equation; the raw residuals are calculated as the observed value minus the predicted value. Often other forms of residuals, such as studentized or cumulative residuals, are used for model diagnostics. Some procedures can calculate standard errors of residuals, predicted mean values, and individual predicted values.

Consider the ith observation, where $\mb{x}_ i'$ is the row of regressors, $\widehat{\bbeta }$ is the vector of parameter estimates, and $s^2$ is the estimate of the residual variance (the mean squared error). The leverage value of the ith observation is defined as

\[ h_ i = w_ i \mb{x}_ i' (\bX '\bW \bX )^{-1} \mb{x}_ i \]

where $\bX $ is the design matrix for the observed data, $\mb{x}_ i'$ is an arbitrary regressor vector (possibly but not necessarily a row of $\bX $), $\bW $ is a diagonal matrix of observed weights, and $w_ i$ is the weight corresponding to $\mb{x}_ i'$.

Then the predicted mean and the standard error of the predicted mean are

\begin{align*} \widehat{y}_ i = & \; \mb{x}_ i'\widehat{\bbeta } \\ \mr{STDERR}(\widehat{y}_ i) = & \sqrt {s^2 h_ i / w_ i} \end{align*}

The standard error of the individual (future) predicted value $y_ i$ is

\[ \mr{STDERR}(y_ i) = \sqrt {s^2 (1 + h_ i) / w_ i} \]
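The formulas for the leverage, the predicted mean, and the two standard errors can be sketched numerically as follows. This is a minimal illustration with NumPy, not the code of any particular procedure: the function name `prediction_standard_errors` and its arguments are hypothetical, the design matrix is assumed to have full column rank, and $s^2$ is assumed to be the weighted residual sum of squares divided by $n-\mr{rank}(\bX )$.

```python
import numpy as np

def prediction_standard_errors(X, y, w, x_new, w_new=1.0):
    """Leverage, predicted mean, and both standard errors for one regressor vector.

    Hypothetical helper: assumes X has full column rank and that s^2 is the
    weighted residual sum of squares divided by n - rank(X).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    x_new = np.asarray(x_new, dtype=float)

    XtWX = X.T @ (w[:, None] * X)              # X'WX
    beta = np.linalg.solve(XtWX, X.T @ (w * y))  # weighted least squares estimates

    resid = y - X @ beta
    nu = X.shape[0] - np.linalg.matrix_rank(X)   # error degrees of freedom
    s2 = np.sum(w * resid**2) / nu               # mean squared error

    h = w_new * (x_new @ np.linalg.solve(XtWX, x_new))  # leverage h_i
    y_hat = x_new @ beta                                # predicted mean
    se_mean = np.sqrt(s2 * h / w_new)                   # STDERR of predicted mean
    se_indiv = np.sqrt(s2 * (1.0 + h) / w_new)          # STDERR of individual prediction
    return h, y_hat, se_mean, se_indiv
```

Because $1 + h_ i > h_ i$, the standard error for an individual prediction always exceeds the standard error of the predicted mean at the same regressor vector.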

If the predictor vector $\mb{x}_ i$ corresponds to an observation in the analysis data, then the raw residual for that observation and the standard error of the raw residual are defined as

\begin{align*} \mr{RESID}_ i = & \; y_ i - \mb{x}_ i'\widehat{\bbeta } \\ \mr{STDERR}(\mr{RESID}_ i) =& \sqrt {s^2 (1 - h_ i) / w_ i} \end{align*}

The studentized residual is the ratio of the raw residual to its estimated standard error. Symbolically,

\[ \mr{STUDENT}_ i = \frac{\mr{RESID}_ i}{\mr{STDERR}(\mr{RESID}_ i)} \]
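As a rough sketch of how the raw residuals, their standard errors, and the studentized residuals fit together, the following NumPy function computes all three for every observation in the analysis data. The name `studentized_residuals` is hypothetical, and the code assumes a full-rank design matrix and a mean squared error equal to the weighted residual sum of squares divided by $n-\mr{rank}(\bX )$.

```python
import numpy as np

def studentized_residuals(X, y, w):
    """Raw residuals, their standard errors, and studentized residuals.

    Hypothetical helper: assumes X has full column rank.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)

    XtWX = X.T @ (w[:, None] * X)                # X'WX
    beta = np.linalg.solve(XtWX, X.T @ (w * y))  # weighted least squares estimates

    resid = y - X @ beta                         # RESID_i
    nu = X.shape[0] - np.linalg.matrix_rank(X)   # error degrees of freedom
    s2 = np.sum(w * resid**2) / nu               # mean squared error

    # h_i = w_i x_i' (X'WX)^{-1} x_i for each row x_i' of X
    h = w * np.einsum("ij,ij->i", X, np.linalg.solve(XtWX, X.T).T)

    se_resid = np.sqrt(s2 * (1.0 - h) / w)       # STDERR(RESID_i)
    return resid, se_resid, resid / se_resid     # last item is STUDENT_i
```

Observations with high leverage have small residual standard errors, so a modest raw residual at such a point can still produce a large studentized residual.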

Two types of intervals provide a measure of confidence for prediction: the confidence interval for the mean value of the response, and the prediction (or forecasting) interval for an individual observation. As discussed in the section Mean Squared Error in Chapter 3: Introduction to Statistical Modeling with SAS/STAT Software, both intervals are based on the mean squared error of predicting a target based on the result of the model fit. The difference in the expressions for the confidence interval and the prediction interval occurs because the target of estimation is a constant in the case of the confidence interval (the mean of an observation) and the target is a random variable in the case of the prediction interval (a new observation).

For example, you can construct a confidence interval for the ith observation that contains the true mean value of the response with probability $1 - \alpha $. The lower and upper limits of the confidence interval for the mean value are

\begin{align*} \mr{LowerM} =& \; \mb{x}_ i' \widehat{\bbeta } - t_{\alpha /2,\nu } \sqrt {s^2 h_ i/w_ i} \\ \mr{UpperM} =& \; \mb{x}_ i' \widehat{\bbeta } + t_{\alpha /2,\nu } \sqrt {s^2 h_ i/w_ i} \end{align*}

where $t_{\alpha /2,\nu }$ is the tabulated t quantile with degrees of freedom equal to the degrees of freedom for the mean squared error, $\nu = n-\mr{rank}(\bX )$.

The limits for the prediction interval for an individual response are

\begin{align*} \mr{LowerI} =& \; \mb{x}_ i'\widehat{\bbeta } - t_{\alpha /2,\nu } \sqrt {s^2(1+h_ i)/w_ i} \\ \mr{UpperI} =& \; \mb{x}_ i'\widehat{\bbeta } + t_{\alpha /2,\nu } \sqrt {s^2(1+h_ i)/w_ i} \end{align*}
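To make the difference between the two intervals concrete, here is a small sketch that evaluates all four limits from quantities that are already available after a fit. The function name `interval_limits` and its arguments are hypothetical. The t quantile is passed in as `t_crit` rather than computed; one way to obtain it, if SciPy is available, is `scipy.stats.t.ppf(1 - alpha / 2, nu)`.

```python
import math

def interval_limits(y_hat, s2, h, w, t_crit):
    """Confidence limits for the mean and prediction limits for an individual response.

    Hypothetical helper: y_hat is the predicted mean, s2 the mean squared error,
    h the leverage, w the weight, and t_crit the t quantile supplied by the caller.
    """
    half_mean = t_crit * math.sqrt(s2 * h / w)           # half-width, mean response
    half_indiv = t_crit * math.sqrt(s2 * (1.0 + h) / w)  # half-width, new observation
    return {
        "LowerM": y_hat - half_mean,
        "UpperM": y_hat + half_mean,
        "LowerI": y_hat - half_indiv,
        "UpperI": y_hat + half_indiv,
    }

# Illustrative inputs (made up for this sketch): 2 error degrees of freedom,
# for which the 0.975 quantile of the t distribution is about 4.302653.
limits = interval_limits(y_hat=2.2, s2=1.35, h=0.3, w=1.0, t_crit=4.302653)
```

Both intervals are centered at the same predicted value; the prediction interval is wider because it also accounts for the variability of a new observation around its mean.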

Influential observations are those that, according to various criteria, appear to have a large influence on the analysis. One measure of influence, Cook’s D, measures the change to the estimates that results from deleting an observation:

\[ \mr{COOKD}_ i = \frac{1}{k} \mr{STUDENT}_ i^2 \left( \frac{\mr{STDERR}(\widehat{y}_ i)}{\mr{STDERR}(\mr{RESID}_ i)} \right)^2 \]

where k is the number of parameters in the model (including the intercept). For more information, see Cook (1977, 1979).
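Substituting the standard errors defined earlier shows how Cook's D depends on the leverage. The factors $s^2$ and $w_ i$ cancel in the ratio, so the statistic can be written in terms of the studentized residual and $h_ i$ alone:

\[ \mr{COOKD}_ i = \frac{1}{k} \mr{STUDENT}_ i^2 \, \frac{s^2 h_ i / w_ i}{s^2 (1 - h_ i) / w_ i} = \frac{1}{k} \mr{STUDENT}_ i^2 \, \frac{h_ i}{1 - h_ i} \]

In this form, an observation has a large Cook's D when it combines a large studentized residual with a leverage close to 1.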

The predicted residual for observation i is the residual for the ith observation that is obtained when the model is fit without the ith observation. The weighted sum of squares of the predicted residuals is called the PRESS statistic:

\begin{align*} \mr{PRESID}_ i =& \; \frac{\mr{RESID}_ i}{1-h_ i} \\ \mr{PRESS} =& \sum _{i=1}^ n w_ i\mr{PRESID}_ i^2 \end{align*}
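The formula for the predicted residual means that no refitting is needed: each leave-one-out residual follows from the full-data residual and the leverage. A minimal NumPy sketch is shown below; the function name `press_statistic` is hypothetical, and the design matrix is assumed to have full column rank, both in the full data and after any single observation is removed.

```python
import numpy as np

def press_statistic(X, y, w):
    """Predicted residuals and the PRESS statistic from a single full-data fit.

    Hypothetical helper: assumes X has full column rank.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)

    XtWX = X.T @ (w[:, None] * X)                # X'WX
    beta = np.linalg.solve(XtWX, X.T @ (w * y))  # weighted least squares estimates
    resid = y - X @ beta                         # RESID_i

    # h_i = w_i x_i' (X'WX)^{-1} x_i for each row x_i' of X
    h = w * np.einsum("ij,ij->i", X, np.linalg.solve(XtWX, X.T).T)

    presid = resid / (1.0 - h)                   # PRESID_i
    press = np.sum(w * presid**2)                # PRESS
    return presid, press
```

A direct check is to refit the model n times, each time omitting one observation, and compare the resulting residual for the omitted observation with the corresponding `presid` value.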