The CANDISC Procedure

Computational Details


General Formulas

Canonical discriminant analysis is equivalent to canonical correlation analysis between the quantitative variables and a set of dummy variables coded from the CLASS variable. In the following notation, the dummy variables are denoted by $\mb{y}$ and the quantitative variables are denoted by $\mb{x}$. The total sample covariance matrix for the $\mb{x}$ and $\mb{y}$ variables is

\[  \mb{S} = \left[\begin{matrix}  \mb{S}_{xx}   &  \mb{S}_{xy}   \cr \mb{S}_{yx}   &  \mb{S}_{yy}   \end{matrix}\right]  \]

When $c$ is the number of groups, $n_t$ is the number of observations in group $t$, and $\mb{S}_t$ is the sample covariance matrix for the $\mb{x}$ variables in group $t$, the within-class pooled covariance matrix for the $\mb{x}$ variables is

\[  \mb{S}_p = \frac{1}{\left(\sum_{t=1}^{c} n_t\right)-c} \sum_{t=1}^{c} (n_t-1)\mb{S}_t  \]
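
The pooled covariance formula can be illustrated with a small sketch. The following Python code is not part of the procedure; it is a minimal illustration that uses only the standard library, and the function names and toy data are hypothetical. Each group is assumed to be a list of observations with the same number of quantitative variables and at least two observations.

```python
def covariance_matrix(rows):
    """Sample covariance matrix (divisor n_t - 1) for one group of observations."""
    n = len(rows)
    p = len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(p)]
    return [
        [
            sum((row[j] - means[j]) * (row[k] - means[k]) for row in rows) / (n - 1)
            for k in range(p)
        ]
        for j in range(p)
    ]


def pooled_covariance(groups):
    """Pooled within-class covariance: sum((n_t - 1) * S_t) / (sum(n_t) - c)."""
    c = len(groups)                               # number of groups
    n_total = sum(len(rows) for rows in groups)   # sum of the n_t
    p = len(groups[0][0])                         # number of x variables
    pooled = [[0.0] * p for _ in range(p)]
    for rows in groups:
        s_t = covariance_matrix(rows)
        weight = len(rows) - 1                    # n_t - 1
        for j in range(p):
            for k in range(p):
                pooled[j][k] += weight * s_t[j][k]
    denom = n_total - c
    return [[value / denom for value in row] for row in pooled]


# Hypothetical toy data: two groups, two quantitative variables.
group_a = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
group_b = [[1.0, 1.0], [3.0, 1.0], [5.0, 4.0], [7.0, 6.0]]
s_p = pooled_covariance([group_a, group_b])
# Up to floating-point rounding, s_p is [[4.4, 4.4], [4.4, 5.2]].
```

In this toy example the weighted sums of squares and cross products add up to 22, 22, and 26, and the divisor is 7 - 2 = 5.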

The canonical correlations, $\rho _ i$, are the square roots of the eigenvalues, $\lambda _ i$, of the following matrix. The corresponding eigenvectors are $\mb{v}_ i$.

\[  {\mb{S}_ p}^{-1/2}\mb{S}_{xy}{\mb{S}_{yy}}^{-1}\mb{S}_{yx}{\mb{S}_ p}^{-1/2}  \]

Let $\mb{V}$ be the matrix that contains the eigenvectors $\mb{v}_ i$ that correspond to nonzero eigenvalues as columns. The raw canonical coefficients are calculated as follows:

\[  \mb{R} = {\mb{S}_ p}^{-1/2}\mb{V}  \]
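
As a brief check on this definition, suppose that the eigenvectors are scaled to unit length and are mutually orthogonal, so that $\mb{V}'\mb{V} = \mb{I}$. This normalization is an assumption here and is not stated above; the prime denotes the transpose, and ${\mb{S}_ p}^{-1/2}$ is taken to be symmetric. Under that assumption, the canonical variables are uncorrelated within classes and have unit pooled within-class variance:

\[  \mb{R}'\mb{S}_ p\mb{R} = \mb{V}'{\mb{S}_ p}^{-1/2}\mb{S}_ p{\mb{S}_ p}^{-1/2}\mb{V} = \mb{V}'\mb{V} = \mb{I}  \]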

The pooled within-class standardized canonical coefficients are

\[  \mb{P} = \mr{diag}(\mb{S}_ p)^{1/2}\mb{R}  \]

The total sample standardized canonical coefficients are

\[  \mb{T} = \mr{diag}(\mb{S}_{xx})^{1/2}\mb{R}  \]

Let $\mb{X}_ c$ be the matrix that contains the centered $\mb{x}$ variables as columns. The canonical scores can be calculated by any of the following:

\[  \mb{X}_ c \,  \mb{R}  \]
\[  \mb{X}_ c \,  \mr{diag}(\mb{S}_ p)^{-1/2}\mb{P}  \]
\[  \mb{X}_ c \,  \mr{diag}(\mb{S}_{xx})^{-1/2}\mb{T}  \]
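
The three expressions agree because the diagonal scaling that defines $\mb{P}$ and $\mb{T}$ is undone before the centered data are multiplied. Assuming that all diagonal elements of $\mb{S}_ p$ and $\mb{S}_{xx}$ are positive, so that the inverse square roots exist, substituting the definitions of $\mb{P}$ and $\mb{T}$ gives

\[  \mb{X}_ c \,  \mr{diag}(\mb{S}_ p)^{-1/2}\mb{P} = \mb{X}_ c \,  \mr{diag}(\mb{S}_ p)^{-1/2}\mr{diag}(\mb{S}_ p)^{1/2}\mb{R} = \mb{X}_ c \,  \mb{R}  \]
\[  \mb{X}_ c \,  \mr{diag}(\mb{S}_{xx})^{-1/2}\mb{T} = \mb{X}_ c \,  \mr{diag}(\mb{S}_{xx})^{-1/2}\mr{diag}(\mb{S}_{xx})^{1/2}\mb{R} = \mb{X}_ c \,  \mb{R}  \]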

For the multivariate tests based on $\mb{E}^{-1}\mb{H}$,

\[  \mb{E} = (n-1)(\mb{S}_{yy} - \mb{S}_{yx}\mb{S}_{xx}^{-1}\mb{S}_{xy})  \]
\[  \mb{H} = (n-1)\mb{S}_{yx}\mb{S}_{xx}^{-1}\mb{S}_{xy}  \]

where $n$ is the total number of observations.
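
As a short illustration of how these matrices relate to the canonical correlations, let $\xi _ i$ denote the eigenvalues of $\mb{E}^{-1}\mb{H}$. The symbol $\xi _ i$ is introduced here only for this note, and the dummy variables are assumed to be coded so that $\mb{S}_{yy}$ is nonsingular. Because the eigenvalues of ${\mb{S}_{yy}}^{-1}\mb{S}_{yx}\mb{S}_{xx}^{-1}\mb{S}_{xy}$ are the squared canonical correlations from the canonical correlation analysis of $\mb{x}$ and $\mb{y}$, it follows that

\[  \xi _ i = \frac{\rho _ i^2}{1-\rho _ i^2}  \]

and, for example, Wilks' lambda can be written as

\[  \frac{\det (\mb{E})}{\det (\mb{E}+\mb{H})} = \prod _ i \frac{1}{1+\xi _ i} = \prod _ i (1-\rho _ i^2)  \]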