Computational Resources :: SAS/STAT(R) 12.1 User's Guide

Computational Resources


Let

$n$ = number of observations
$v$ = number of variables
$c$ = number of clusters
$p$ = number of passes over the data set

Memory

The memory required is approximately $4(19v + 12cv + 10c + 2 \max (c+ 1, v))$ bytes.
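As a quick way to size a run, the memory formula above can be evaluated directly. The following is a minimal sketch; the function name `fastclus_memory_bytes` is hypothetical and is not part of SAS. It only evaluates the approximation stated above and does not account for any optional features.

```python
def fastclus_memory_bytes(v: int, c: int) -> int:
    """Approximate memory in bytes: 4(19v + 12cv + 10c + 2 max(c + 1, v)).

    v is the number of variables and c is the number of clusters.
    The function name is illustrative only; it is not a SAS API.
    """
    if v < 1 or c < 1:
        raise ValueError("v and c must be positive integers")
    return 4 * (19 * v + 12 * c * v + 10 * c + 2 * max(c + 1, v))


if __name__ == "__main__":
    # 10 variables, 5 clusters: 4 * (190 + 600 + 50 + 20) = 3440
    print(fastclus_memory_bytes(v=10, c=5))
```

Note that the estimate does not depend on $n$: by this formula, memory grows with the number of variables and clusters, not with the number of observations.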

If you request the DISTANCE option, an additional $4c(c+1)$ bytes of space is needed.

Time

The overall time required by PROC FASTCLUS is roughly proportional to $nvcp$ if $c$ is small with respect to $n$.

Initial seed selection requires one pass over the data set. If the observations are in random order, the time required is roughly proportional to

$nvc + vc^2$

unless you specify REPLACE=NONE. In that case, a complete pass might not be necessary, and the time is roughly proportional to $mvc$, where $c \leq m \leq n$.

The DRIFT option, each iteration, and the final assignment of cluster seeds each require one pass, with time for each pass roughly proportional to $nvc$.
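The pass-by-pass accounting above can be sketched as a relative cost model. This is an illustration only: the function names are hypothetical, the results are unitless operation counts rather than seconds, and the per-pass cost is assumed to be on the order of $nvc$ (the seed-selection cost without its $vc^2$ term). The REPLACE=NONE case is not modeled.

```python
def seed_selection_cost(n: int, v: int, c: int) -> int:
    """Relative cost of initial seed selection: nvc + vc^2 (observations in random order)."""
    return n * v * c + v * c ** 2


def pass_cost(n: int, v: int, c: int) -> int:
    """Relative cost of one full pass over the data (assumed to be on the order of nvc)."""
    return n * v * c


def total_cost(n: int, v: int, c: int, iterations: int, drift: bool = False) -> int:
    """Seed selection, plus one pass per iteration, one pass for the final
    assignment, and one extra pass if the DRIFT option is requested."""
    passes = iterations + 1 + (1 if drift else 0)
    return seed_selection_cost(n, v, c) + passes * pass_cost(n, v, c)


if __name__ == "__main__":
    # 1000 observations, 10 variables, 5 clusters, 2 iterations
    print(total_cost(1000, 10, 5, iterations=2))              # 200250
    print(total_cost(1000, 10, 5, iterations=2, drift=True))  # 250250
```

Comparing two configurations with this model gives a rough ratio of run times. For example, under these assumptions doubling $c$ roughly doubles each pass as long as $c$ stays small relative to $n$.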

For greatest efficiency, you should list the variables in the VAR statement in order of decreasing variance.
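One way to follow this advice is to compute each variable's variance ahead of time and generate the VAR statement from the sorted names. The sketch below assumes the data are already available as a Python dictionary of columns; the function name `var_statement` and the variable names `x1`, `x2`, `x3` are hypothetical.

```python
from statistics import variance


def var_statement(columns: dict) -> str:
    """Return a VAR statement listing variables in order of decreasing sample variance.

    `columns` maps each variable name to a list of at least two numeric values.
    """
    ordered = sorted(columns, key=lambda name: variance(columns[name]), reverse=True)
    return "var " + " ".join(ordered) + ";"


if __name__ == "__main__":
    data = {
        "x1": [1, 2, 3, 4],      # moderate variance
        "x2": [10, 20, 30, 40],  # largest variance
        "x3": [5, 5, 6, 5],      # smallest variance
    }
    print(var_statement(data))  # var x2 x1 x3;
```

The generated line can then be pasted into the PROC FASTCLUS step in place of a hand-written VAR statement.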
