The RSREG Procedure

Computational Method

Canonical Analysis

For each response variable, the model can be written in the form

\[ y_ i = \mb{x}_ i^{\prime }\mb{A}\mb{x}_ i + \mb{b}^{\prime }\mb{x}_ i + \mb{c}^{\prime }\mb{z}_ i + \epsilon _ i \]

where

$y_ i$

is the ith observation of the response variable.

$\mb{x}_ i$

$=(x_{i1}, x_{i2}, \ldots , x_{ik})^{\prime }$ are the $k$ factor variables for the ith observation.

$\mb{z}_ i$

$=(z_{i1}, z_{i2}, \ldots , z_{iL})^{\prime }$ are the $L$ covariates, including the intercept term.

$\mb{A}$

is the $k \times k$ symmetrized matrix of quadratic parameters, with diagonal elements equal to the coefficients of the pure quadratic terms in the model and off-diagonal elements equal to half the coefficient of the corresponding crossproduct.

$\mb{b}$

is the $k \times 1$ vector of linear parameters.

$\mb{c}$

is the $L \times 1$ vector of covariate parameters, one of which is the intercept.

${\epsilon }_ i$

is the error associated with the ith observation. Tests performed by PROC RSREG assume that errors are independently and normally distributed with mean zero and variance $\sigma ^2$.
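As a concrete sketch of how $\mb{A}$ and $\mb{b}$ assemble the quadratic model (using hypothetical coefficients for a two-factor model, not PROC RSREG output), the diagonal of $\mb{A}$ holds the pure quadratic coefficients and each off-diagonal entry holds half the corresponding crossproduct coefficient, so that $\mb{x}^{\prime }\mb{A}\mb{x} + \mb{b}^{\prime }\mb{x}$ reproduces the model:

```python
import numpy as np

# Hypothetical fitted coefficients for a two-factor quadratic model:
# y = b1*x1 + b2*x2 + b11*x1^2 + b22*x2^2 + b12*x1*x2
b1, b2 = 1.0, 2.0          # linear coefficients
b11, b22 = -3.0, -2.0      # pure quadratic coefficients
b12 = 4.0                  # crossproduct coefficient

# Symmetrized quadratic matrix: diagonal = pure quadratic terms,
# off-diagonal = half the crossproduct coefficient
A = np.array([[b11,     b12 / 2],
              [b12 / 2, b22    ]])
b = np.array([b1, b2])

# x'Ax + b'x reproduces the quadratic model at any point x
x = np.array([0.5, -1.5])
quad = x @ A @ x + b @ x
direct = b1*x[0] + b2*x[1] + b11*x[0]**2 + b22*x[1]**2 + b12*x[0]*x[1]
assert np.isclose(quad, direct)
```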

The parameters in $\mb{A}$, $\mb{b}$, and $\mb{c}$ are estimated by least squares. To optimize $\mb{y}$ with respect to $\mb{x}$, take partial derivatives, set them to zero, and solve:

\[ \frac{\partial y}{\partial \mb{x}}~ =~ 2\mb{x}^{\prime }\mb{A} + \mb{b}^{\prime } = \mb{0}~ ~ ~ \Longrightarrow ~ ~ ~ \mb{x}~ =~ -\frac{1}{2}\mb{A}^{-1}\mb{b} \]
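This stationary-point computation can be sketched numerically with hypothetical estimates of $\mb{A}$ and $\mb{b}$ (assumed values for illustration only, not PROC RSREG itself):

```python
import numpy as np

# Hypothetical estimates (assumed for illustration only)
A = np.array([[-3.0,  2.0],
              [ 2.0, -2.0]])   # symmetrized quadratic matrix
b = np.array([1.0, 2.0])       # linear coefficients

# Stationary point: x = -(1/2) A^{-1} b
x_stat = -0.5 * np.linalg.solve(A, b)

# The gradient 2Ax + b vanishes at the stationary point
grad = 2 * A @ x_stat + b
assert np.allclose(grad, 0)
```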

You can determine whether the solution is a maximum, a minimum, a saddle point, or in a flat area by examining the eigenvalues of $\mb{A}$:

If the eigenvalues…     then the solution is…
----------------------  ---------------------
are all negative        a maximum
are all positive        a minimum
have mixed signs        a saddle point
contain zeros           in a flat area
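This eigenvalue criterion is straightforward to apply numerically; the following sketch (with hypothetical matrices, and a tolerance for deciding that an eigenvalue is zero) mirrors the classification above:

```python
import numpy as np

def classify_stationary_point(A, tol=1e-10):
    """Classify the stationary point of x'Ax + b'x by the eigenvalues of A."""
    eigvals = np.linalg.eigvalsh(A)  # A is symmetric
    if np.any(np.abs(eigvals) < tol):
        return "flat area"
    if np.all(eigvals < 0):
        return "maximum"
    if np.all(eigvals > 0):
        return "minimum"
    return "saddle point"

# Hypothetical quadratic matrices
assert classify_stationary_point(np.array([[-3.0, 0.0], [0.0, -2.0]])) == "maximum"
assert classify_stationary_point(np.array([[ 3.0, 0.0], [0.0, -2.0]])) == "saddle point"
```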

Ridge Analysis

If the largest eigenvalue is positive, its eigenvector gives the direction of steepest ascent from the stationary point; if the largest eigenvalue is negative, its eigenvector gives the direction of steepest descent. The eigenvectors corresponding to small or zero eigenvalues point in directions of relative flatness.

The point on the optimum response ridge at a given radius $R$ from the ridge origin is found by optimizing

\[ (\mb{x}_0 + \mb{d})^{\prime }\mb{A}(\mb{x}_0 + \mb{d}) + \mb{b}^{\prime }(\mb{x}_0 + \mb{d}) \]

over $\mb{d}$ satisfying $\mb{d}^{\prime }\mb{d}=R^2$, where $\mb{x}_0$ is the $k \times 1$ vector containing the ridge origin and $\mb{A}$ and $\mb{b}$ are as previously discussed. By the method of Lagrange multipliers, the optimal $\mb{d}$ has the form

\[ \mb{d} = -(\mb{A} - \mu \mb{I})^{-1}(\mb{Ax}_0 + 0.5 \mb{b}) \]

where $\mb{I}$ is the $k \times k$ identity matrix and $\mu $ is chosen so that $\mb{d}^{\prime }\mb{d}=R^2$. There can be several values of $\mu $ that satisfy this constraint; the correct one depends on which sort of response ridge is of interest. If you are searching for the ridge of maximum response, then the appropriate $\mu $ is the unique one that satisfies the constraint and is greater than all the eigenvalues of $\mb{A}$. Similarly, the appropriate $\mu $ for the ridge of minimum response satisfies the constraint and is less than all the eigenvalues of $\mb{A}$. (See Myers and Montgomery (1995) for details.)
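The maximum-response case of this ridge computation can be sketched as follows: choose $\mu $ above the largest eigenvalue of $\mb{A}$ and adjust it until $\mb{d}^{\prime }\mb{d}=R^2$. This is an illustrative bisection under hypothetical values of $\mb{A}$ and $\mb{b}$, not the algorithm used by PROC RSREG; the minimum-response case is symmetric, with $\mu $ below all eigenvalues.

```python
import numpy as np

def max_ridge_point(A, b, x0, R, iters=200):
    """Radius-R point on the maximum-response ridge from origin x0.

    Chooses mu above all eigenvalues of A (the maximum-ridge case) so
    that d = -(A - mu*I)^{-1}(A x0 + 0.5 b) satisfies d'd = R^2.
    """
    I = np.eye(len(b))
    v = A @ x0 + 0.5 * b
    lam_max = np.linalg.eigvalsh(A)[-1]   # eigenvalues in ascending order

    def d_of(mu):
        return -np.linalg.solve(A - mu * I, v)

    # ||d(mu)|| falls monotonically from +inf (just above the spectrum)
    # toward 0 as mu -> +inf, so bisect on ||d(mu)|| = R.
    lo, hi = lam_max + 1e-8, lam_max + 1.0
    while np.linalg.norm(d_of(hi)) > R:
        hi += hi - lam_max               # expand the bracket upward
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if np.linalg.norm(d_of(mid)) > R else (lo, mid)
    mu = 0.5 * (lo + hi)
    return x0 + d_of(mu), mu

# Hypothetical two-factor example: maximum-response ridge at radius R = 1
A = np.array([[-3.0, 2.0], [2.0, -2.0]])
b = np.array([1.0, 2.0])
x_ridge, mu = max_ridge_point(A, b, np.zeros(2), R=1.0)
assert np.isclose(np.linalg.norm(x_ridge), 1.0)
```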