The CANDISC Procedure

Computational Details

General Formulas

Canonical discriminant analysis is equivalent to canonical correlation analysis between the quantitative variables and a set of dummy variables coded from the CLASS variable. In the following notation, the dummy variables are denoted by $\mb {y}$ and the quantitative variables are denoted by $\mb {x}$. The total sample covariance matrix for the $\mb {x}$ and $\mb {y}$ variables is

\[  \mb {S} = \left[\begin{matrix}  \mb {S}_{xx}   &  \mb {S}_{xy}   \cr \mb {S}_{yx}   &  \mb {S}_{yy}   \end{matrix}\right]  \]

Let $c$ be the number of groups, $n_ t$ the number of observations in group $t$, and $\mb {S}_ t$ the sample covariance matrix for the $\mb {x}$ variables in group $t$. The within-class pooled covariance matrix for the $\mb {x}$ variables is then

\[  \mb {S}_ p = \frac{1}{\sum _{t=1}^{c} n_ t-c} {\sum _{t=1}^{c} (n_ t-1)\mb {S}_ t}  \]
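
As an illustration only, the pooled covariance formula can be sketched in Python with NumPy. The function name `pooled_within_cov` and the toy data are assumptions made for this example and are not part of the procedure. The sketch accumulates each group's centered cross-product matrix, which equals $(n_ t-1)\mb {S}_ t$, and divides by $\sum n_ t - c$.

```python
import numpy as np

def pooled_within_cov(X, labels):
    """Pooled within-class covariance: sum_t (n_t - 1) S_t / (sum_t n_t - c)."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    p = X.shape[1]
    weighted = np.zeros((p, p))
    for t in classes:
        Xt = X[labels == t]
        centered = Xt - Xt.mean(axis=0)
        # The centered cross-product matrix equals (n_t - 1) * S_t.
        weighted += centered.T @ centered
    return weighted / (X.shape[0] - len(classes))

# Toy data (hypothetical): two groups, three observations each, two variables.
X = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, 3.0],
              [6.0, 7.0], [7.0, 9.0], [8.0, 8.0]])
labels = np.array([0, 0, 0, 1, 1, 1])
S_p = pooled_within_cov(X, labels)
print(S_p)
```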

The canonical correlations, $\rho _ i$, are the square roots of the eigenvalues, $\lambda _ i$, of the following matrix. The corresponding eigenvectors are $\mb {v}_ i$.

\[  {\mb {S}_ p}^{-1/2}\mb {S}_{xy}{\mb {S}_{yy}}^{-1}\mb {S}_{yx}{\mb {S}_ p}^{-1/2}  \]

Let $\mb {V}$ be the matrix that contains the eigenvectors $\mb {v}_ i$ that correspond to nonzero eigenvalues as columns. The raw canonical coefficients are calculated as follows:

\[  \mb {R} = {\mb {S}_ p}^{-1/2}\mb {V}  \]
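
The eigenanalysis and the raw coefficients might be computed along the following lines. This is a minimal NumPy sketch under stated assumptions, not the procedure's actual implementation: the function name `raw_canonical_coefficients`, the tolerance `tol`, and the choice to code the CLASS variable as $c-1$ dummy columns (dropping the last class so that $\mb {S}_{yy}$ is invertible) are all assumptions of this example. The inverse square root of $\mb {S}_ p$ is taken as the symmetric one, obtained from its own eigendecomposition.

```python
import numpy as np

def raw_canonical_coefficients(X, labels, tol=1e-10):
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    n, p = X.shape
    c = len(classes)

    # Assumed dummy coding: one 0/1 column per class, last class dropped.
    Y = (labels[:, None] == classes[None, :-1]).astype(float)

    # Total sample covariance of (x, y), partitioned into blocks.
    S = np.cov(np.hstack([X, Y]), rowvar=False)
    S_xy = S[:p, p:]
    S_yy = S[p:, p:]

    # Pooled within-class covariance S_p.
    W = np.zeros((p, p))
    for t in classes:
        centered = X[labels == t] - X[labels == t].mean(axis=0)
        W += centered.T @ centered
    S_p = W / (n - c)

    # Symmetric inverse square root of S_p (requires S_p positive definite).
    w, Q = np.linalg.eigh(S_p)
    S_p_inv_half = Q @ np.diag(w ** -0.5) @ Q.T

    # Matrix whose eigenvalues and eigenvectors are required.
    M = S_p_inv_half @ S_xy @ np.linalg.solve(S_yy, S_xy.T) @ S_p_inv_half
    M = (M + M.T) / 2  # remove rounding asymmetry before eigh

    lam, V = np.linalg.eigh(M)
    order = np.argsort(lam)[::-1]      # largest eigenvalue first
    lam, V = lam[order], V[:, order]
    keep = lam > tol * lam[0]          # keep numerically nonzero eigenvalues
    R = S_p_inv_half @ V[:, keep]      # raw canonical coefficients
    return lam[keep], R, S_p

# Toy data (hypothetical): two groups, two variables.
X = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, 3.0],
              [6.0, 7.0], [7.0, 9.0], [8.0, 8.0]])
labels = np.array([0, 0, 0, 1, 1, 1])
lam, R, S_p = raw_canonical_coefficients(X, labels)
```

Because the columns of $\mb {V}$ are orthonormal, the coefficients in this sketch satisfy $\mb {R}'\mb {S}_ p\mb {R} = \mb {I}$, so each canonical variable has unit pooled within-class variance.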

The pooled within-class standardized canonical coefficients are

\[  \mb {P} = \mr {diag}(\mb {S}_ p)^{1/2}\mb {R}  \]

The total sample standardized canonical coefficients are

\[  \mb {T} = \mr {diag}(\mb {S}_{xx})^{1/2}\mb {R}  \]

Let $\mb {X}_ c$ be the matrix that contains the centered $\mb {x}$ variables as columns. The canonical scores can be calculated by any of the following:

\[  \mb {X}_ c \,  \mb {R}  \]
\[  \mb {X}_ c \,  \mr {diag}(\mb {S}_ p)^{-1/2}\mb {P}  \]
\[  \mb {X}_ c \,  \mr {diag}(\mb {S}_{xx})^{-1/2}\mb {T}  \]
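
The equivalence of the three score formulas can be checked numerically. The sketch below is illustrative only: the function name `standardized_coefficients_and_scores` is hypothetical, and the values of `R` and `S_p` are arbitrary stand-ins rather than the output of a real analysis. Multiplying by $\mr {diag}(\cdot )^{1/2}$ scales each row of $\mb {R}$ by a standard deviation, and dividing the centered columns by the same standard deviation undoes that scaling.

```python
import numpy as np

def standardized_coefficients_and_scores(X, R, S_p):
    X = np.asarray(X, dtype=float)
    S_xx = np.cov(X, rowvar=False)       # total-sample covariance of x
    X_c = X - X.mean(axis=0)             # centered x variables

    sd_pooled = np.sqrt(np.diag(S_p))    # pooled within-class std. deviations
    sd_total = np.sqrt(np.diag(S_xx))    # total-sample std. deviations

    P = sd_pooled[:, None] * R           # diag(S_p)^{1/2} R
    T = sd_total[:, None] * R            # diag(S_xx)^{1/2} R

    scores_raw = X_c @ R
    scores_pooled = (X_c / sd_pooled) @ P   # X_c diag(S_p)^{-1/2} P
    scores_total = (X_c / sd_total) @ T     # X_c diag(S_xx)^{-1/2} T
    return P, T, scores_raw, scores_pooled, scores_total

# Illustrative inputs only; R and S_p are arbitrary stand-in values.
X = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, 3.0],
              [6.0, 7.0], [7.0, 9.0], [8.0, 8.0]])
S_p = np.array([[4.0, 1.0], [1.0, 9.0]])
R = np.array([[1.0], [0.5]])
P, T, s_raw, s_pooled, s_total = standardized_coefficients_and_scores(X, R, S_p)
```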

For the multivariate tests based on $\mb {E}^{-1}\mb {H}$,

\[  \mb {E} = (n-1)(\mb {S}_{yy} - \mb {S}_{yx}\mb {S}_{xx}^{-1}\mb {S}_{xy})  \]
\[  \mb {H} = (n-1)\mb {S}_{yx}\mb {S}_{xx}^{-1}\mb {S}_{xy}  \]

where $n$ is the total number of observations.
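
A minimal NumPy sketch of these two matrices follows. As before, the function name `error_and_hypothesis` and the coding of the CLASS variable as $c-1$ dummy columns are assumptions of this example. The sketch also evaluates Wilks' lambda, $\det (\mb {E})/\det (\mb {E}+\mb {H})$, as one common statistic that can be built from $\mb {E}$ and $\mb {H}$; the text above does not say which statistics the procedure reports.

```python
import numpy as np

def error_and_hypothesis(X, labels):
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    n, p = X.shape

    # Assumed dummy coding: one 0/1 column per class, last class dropped.
    Y = (labels[:, None] == classes[None, :-1]).astype(float)

    # Total sample covariance of (x, y), partitioned into blocks.
    S = np.cov(np.hstack([X, Y]), rowvar=False)
    S_xx, S_xy = S[:p, :p], S[:p, p:]
    S_yx, S_yy = S[p:, :p], S[p:, p:]

    explained = S_yx @ np.linalg.solve(S_xx, S_xy)  # S_yx S_xx^{-1} S_xy
    E = (n - 1) * (S_yy - explained)
    H = (n - 1) * explained
    return E, H, S_yy

# Toy data (hypothetical): two groups, two variables.
X = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, 3.0],
              [6.0, 7.0], [7.0, 9.0], [8.0, 8.0]])
labels = np.array([0, 0, 0, 1, 1, 1])
E, H, S_yy = error_and_hypothesis(X, labels)

# Wilks' lambda, a standard statistic built from E and H.
wilks = np.linalg.det(E) / np.linalg.det(E + H)
```

By construction, $\mb {E} + \mb {H} = (n-1)\mb {S}_{yy}$, which gives a quick consistency check on any implementation.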