- performs a disjoint cluster analysis on the basis of distances computed from one or more quantitative variables
- observations are divided into clusters such that every observation belongs to one and only one cluster
- uses Euclidean distances, so the cluster centers are based on least squares estimation (k-means model)
- is designed to find good clusters (but not necessarily the best possible clusters) with only two or three passes through the data set
- can be an effective procedure for detecting outliers because outliers often appear as clusters with only one member
- can use an Lp (least pth powers) clustering criterion
- is intended for use with large data sets, with 100 or more observations
- uses algorithms that place a larger influence on variables with larger variance
- produces brief summaries of the clusters
- produces an output data set containing a cluster membership variable
- can obtain separate analyses for observations in groups
- can compute weighted cluster means
- uses ODS to create a SAS data set corresponding to any table
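
The k-means idea behind several of the points above (disjoint clusters, Euclidean distances, centers as least squares means, a small number of passes, and outliers appearing as one-member clusters) can be illustrated with a minimal Python sketch. This is not the FASTCLUS algorithm itself: the function names `nearest` and `kmeans`, the toy data, and the hand-picked seeds are all assumptions made for illustration.

```python
import math

def nearest(point, centers):
    """Return the index of the center closest to point (Euclidean distance)."""
    return min(range(len(centers)), key=lambda j: math.dist(point, centers[j]))

def kmeans(points, seeds, passes=3):
    """Hypothetical helper: a few assign-then-update passes over the data."""
    centers = [tuple(s) for s in seeds]
    labels = []
    for _ in range(passes):
        # Disjoint assignment: each point gets exactly one cluster label.
        labels = [nearest(p, centers) for p in points]
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the previous center if a cluster is empty
                # The mean is the least squares center under Euclidean distance.
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centers

# Toy data (assumed): two tight groups and one distant point.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
          (5.0, 5.0), (5.1, 4.9), (4.8, 5.2),
          (20.0, 20.0)]
seeds = [points[0], points[3], points[6]]  # hand-picked initial seeds (assumed)

labels, centers = kmeans(points, seeds)
sizes = [labels.count(j) for j in range(len(seeds))]
# Clusters with a single member are candidate outliers.
singletons = [j for j, n in enumerate(sizes) if n == 1]
print(labels, sizes, singletons)
```

In this toy run the distant point ends up alone in its own cluster, which is how a one-member cluster can flag a possible outlier. Because the distances are not scaled, a variable with a much larger variance would dominate the assignment step in the same way the list describes.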
For further details, see the SAS/STAT User's Guide: The FASTCLUS Procedure.