The REG Procedure

Computations for Ridge Regression and IPC Analysis

In ridge regression analysis, the crossproduct matrix for the independent variables is centered (the NOINT option is ignored if it is specified) and scaled so that its diagonal elements equal one. The ridge constant k (specified with the RIDGE= option) is then added to each diagonal element of the crossproduct matrix. The ridge regression estimates are the least squares estimates obtained by using the new crossproduct matrix.

Let X be an $n \times p$ matrix of the independent variables after centering the data, and let Y be an $n \times 1$ vector corresponding to the dependent variable. Let D be a $p \times p$ diagonal matrix whose diagonal elements are those of $\mb{X}'\mb{X}$. The ridge regression estimate corresponding to the ridge constant k can be computed as

$\mb{D}^{-\frac{1}{2}} (\mb{Z}'\mb{Z} + k\mb{I}_{p})^{-1} \mb{Z}'\mb{Y}$

where $\mb{Z} = \mb{X}\mb{D}^{-\frac{1}{2}}$ and $\mb{I}_{p}$ is a $p \times p$ identity matrix.
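The formula above can be sketched in a few lines of NumPy. This is an illustration of the algebra only, not the REG procedure's internal code; the function name `ridge_estimate` and the sample data are hypothetical, and the sketch assumes that no independent variable is constant (so every diagonal element of D is positive).

```python
import numpy as np

def ridge_estimate(X_raw, y, k):
    """Ridge slope estimates on the original scale of the regressors.

    Hypothetical helper: X_raw is n x p (uncentered), y has length n,
    and k is the ridge constant.
    """
    X = X_raw - X_raw.mean(axis=0)        # center each independent variable
    d = np.sum(X * X, axis=0)             # diagonal elements of X'X (matrix D)
    Z = X / np.sqrt(d)                    # Z = X D^{-1/2}, so diag(Z'Z) = 1
    p = X.shape[1]
    # Solve (Z'Z + k I_p) b = Z'Y instead of forming the inverse explicitly.
    b_std = np.linalg.solve(Z.T @ Z + k * np.eye(p), Z.T @ y)
    return b_std / np.sqrt(d)             # D^{-1/2} (Z'Z + k I_p)^{-1} Z'Y

# Usage with made-up data: estimates for a few ridge constants.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(40, 3))
y_demo = 3.0 + X_demo @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=40)
for k in (0.0, 0.1, 0.5):
    print(k, ridge_estimate(X_demo, y_demo, k))
```

Because Z is centered, $\mb{Z}'\mb{Y}$ is unchanged by centering Y, so the sketch leaves the dependent variable as is. With k = 0 the result coincides with the ordinary least squares slopes from a model that includes an intercept.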

For incomplete principal components (IPC) analysis, the m smallest eigenvalues of $\mb{Z}'\mb{Z}$ (where m is specified with the PCOMIT= option) are omitted when forming the estimates.
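One way to read this description is as a spectral decomposition of $\mb{Z}'\mb{Z}$ in which the terms for the m smallest eigenvalues are dropped before inverting. The NumPy sketch below follows that interpretation; it is an assumption about the computation rather than the procedure's documented algorithm, and the function name `ipc_estimate` and the sample data are hypothetical.

```python
import numpy as np

def ipc_estimate(X_raw, y, m):
    """IPC slope estimates on the original scale of the regressors.

    Hypothetical helper: omits the m smallest eigenvalues of Z'Z
    (and their eigenvectors) when forming the estimates.
    """
    X = X_raw - X_raw.mean(axis=0)             # center each independent variable
    d = np.sum(X * X, axis=0)                  # diagonal elements of X'X (matrix D)
    Z = X / np.sqrt(d)                         # Z = X D^{-1/2}
    eigvals, eigvecs = np.linalg.eigh(Z.T @ Z) # eigenvalues in ascending order
    lam = eigvals[m:]                          # keep the p - m largest eigenvalues
    V = eigvecs[:, m:]                         # and the matching eigenvectors
    # Truncated inverse: sum over retained components of v v' / lambda.
    b_std = V @ ((V.T @ (Z.T @ y)) / lam)
    return b_std / np.sqrt(d)                  # back to the original scale

# Usage with made-up data: omit 0, 1, or 2 components.
rng = np.random.default_rng(1)
X_demo2 = rng.normal(size=(40, 3))
y_demo2 = 3.0 + X_demo2 @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=40)
for m in (0, 1, 2):
    print(m, ipc_estimate(X_demo2, y_demo2, m))
```

With m = 0 nothing is omitted and the sketch reproduces the ordinary least squares slopes. For m > 0 the standardized coefficient vector has no component along the omitted eigenvectors.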

For information about ridge regression and IPC standardized parameter estimates, parameter estimate standard errors, and variance inflation factors, see Rawlings, Pantula, and Dickey (1998); Neter, Wasserman, and Kutner (1990); and Marquardt and Snee (1975). Unlike Rawlings, Pantula, and Dickey (1998), the REG procedure uses the mean squared errors of the submodels instead of the full model MSE to compute the standard errors of the parameter estimates.