The CANDISC Procedure

Computational Resources

In the following discussion, let

\[  \begin{array}{rcl}
n & = & \mbox{number of observations} \\
c & = & \mbox{number of class levels} \\
v & = & \mbox{number of variables in the VAR list} \\
l & = & \mbox{length of the CLASS variable}
\end{array}  \]

Memory Requirements

The amount of memory in bytes for temporary storage needed to process the data is

\[  c(4v^2 + 28v + 4l + 68) + 16v^2 + 96v + 4l  \]

With the ANOVA option, the temporary storage must be increased by $16v$ bytes. The DISTANCE option requires an additional $4v^2+4v$ bytes of temporary storage.
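
As a rough illustration, the memory formula and the two option adjustments can be evaluated directly. The following Python sketch is not part of PROC CANDISC; the function name `candisc_temp_bytes` and its parameters are hypothetical and simply mirror the symbols $c$, $v$, and $l$ defined above.

```python
def candisc_temp_bytes(c, v, l, anova=False, distance=False):
    """Estimate temporary storage in bytes from the formula in this section.

    c: number of class levels
    v: number of variables in the VAR list
    l: length of the CLASS variable
    anova, distance: hypothetical flags standing in for the ANOVA and
    DISTANCE options
    """
    # Base requirement: c(4v^2 + 28v + 4l + 68) + 16v^2 + 96v + 4l
    total = c * (4 * v**2 + 28 * v + 4 * l + 68) + 16 * v**2 + 96 * v + 4 * l
    if anova:
        # ANOVA option adds 16v bytes
        total += 16 * v
    if distance:
        # DISTANCE option adds 4v^2 + 4v bytes
        total += 4 * v**2 + 4 * v
    return total


if __name__ == "__main__":
    # Example: 3 class levels, 4 variables, CLASS variable of length 8
    print(candisc_temp_bytes(3, 4, 8))
    print(candisc_temp_bytes(3, 4, 8, anova=True, distance=True))
```

For example, with $c=3$, $v=4$, and $l=8$, the base formula gives 1500 bytes, and enabling both options raises the total to 1644 bytes.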

Time Requirements

The following factors determine the time requirements of the CANDISC procedure:

  • The time needed for reading the data and computing covariance matrices is proportional to $nv^2$. PROC CANDISC must also look up each class level in the list. This is faster if the data are sorted by the CLASS variable. The time for looking up class levels is proportional to a value ranging from $n$ to $n\log (c)$.

  • The time for inverting a covariance matrix is proportional to $v^3$.

  • The time required for the canonical discriminant analysis is proportional to $v^3$.

Each of the preceding factors has a different constant of proportionality.
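
Because the constants of proportionality are not stated, the factors above cannot be turned into actual run times, but their growth terms can be compared for a given problem size. The following Python sketch computes only those terms. The function name `candisc_time_terms` is hypothetical, and the use of a base-2 logarithm is an assumption, since the base of $\log (c)$ is not specified here.

```python
import math


def candisc_time_terms(n, c, v):
    """Return the growth terms listed in this section (unitless, not seconds).

    n: number of observations
    c: number of class levels (assumed to be at least 2)
    v: number of variables in the VAR list
    """
    return {
        # Reading the data and computing covariance matrices: n * v^2
        "read_and_covariance": n * v**2,
        # Class-level lookup ranges from n to n * log(c);
        # base 2 is an assumption made for this sketch
        "class_lookup_low": n,
        "class_lookup_high": n * math.log2(c),
        # Inverting a covariance matrix: v^3
        "matrix_inversion": v**3,
        # Canonical discriminant analysis: v^3
        "canonical_analysis": v**3,
    }


if __name__ == "__main__":
    for name, value in candisc_time_terms(n=1000, c=4, v=10).items():
        print(f"{name}: {value}")
```

Each term would still need to be multiplied by its own (unknown) constant before the terms could be summed or compared as times, so the values are useful mainly for seeing which term grows fastest as $n$ or $v$ increases.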