PROC VARCLUS <options> ;
The PROC VARCLUS statement invokes the VARCLUS procedure. By default, VARCLUS clusters the numeric variables in the most recently created SAS data set, starting with one cluster and splitting clusters until all clusters have at most one eigenvalue greater than one.
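As a minimal sketch of this default behavior, the following step runs VARCLUS with no options. The data set name work.cars_num is hypothetical, and Sashelp.Cars is used only as a convenient sample with several numeric variables; neither is part of this documentation.

```sas
/* Create a data set so that it is the most recently created one. */
/* work.cars_num is a hypothetical name used for illustration.     */
data work.cars_num;
   set sashelp.cars;
run;

/* No DATA= option: VARCLUS reads the most recently created data set. */
/* No VAR statement: the numeric variables are clustered.              */
proc varclus;
run;

/* The same request with the input data set and variables named explicitly. */
proc varclus data=work.cars_num;
   var _numeric_;
run;
```

Naming the data set with DATA= is usually safer in longer programs, because an intervening step can change which data set was created most recently.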
Table 104.1 summarizes the options available in the PROC VARCLUS statement.
Table 104.1: Options Available in the PROC VARCLUS Statement
Option           Description

Data Sets
DATA=            Specifies the input SAS data set
OUTSTAT=         Specifies the output SAS data set to contain statistics
OUTTREE=         Specifies the output SAS data set for use with PROC TREE

Input Data Processing
COVARIANCE       Uses the covariance matrix instead of the correlation matrix
NOINT            Omits the intercept
VARDEF=          Specifies the divisor for variances

Number of Clusters
MAXCLUSTERS=     Specifies the maximum number of clusters
MINCLUSTERS=     Specifies the minimum number of clusters
MAXEIGEN=        Specifies the maximum second eigenvalue in a cluster
PROPORTION=      Specifies the minimum proportion of variance explained by a cluster component

Clustering Methods
CENTROID         Uses centroid components instead of principal components
HIERARCHY        Clusters hierarchically
INITIAL=         Specifies the initialization method
MAXITER=         Specifies the maximum iterations during the alternating least squares phase
MAXSEARCH=       Specifies the maximum iterations during the search phase
MULTIPLEGROUP    Performs a multiple group component analysis
RANDOM=          Specifies the random number seed

Control Displayed Output
CORR             Displays the correlation matrix
NOPRINT          Suppresses displayed output
PLOTS=           Specifies ODS Graphics details
SHORT            Suppresses display of large matrices
SIMPLE           Displays means and standard deviations
SUMMARY          Suppresses all default displayed output except the final summary table
TRACE            Displays the cluster to which each variable is assigned during the iterations

VARCLUS chooses which cluster to split based on the MAXEIGEN= and PROPORTION= options.
If you specify either or both of these two options, then only the specified options affect the choice of the cluster to split.
If you specify neither of these options, the criterion for choice of cluster to split depends on the CENTROID option:
- If you specify CENTROID, VARCLUS splits the cluster with the smallest percentage of variation explained by its cluster component, as if you had specified the PROPORTION= option.
- If you do not specify CENTROID, VARCLUS splits the cluster with the largest eigenvalue associated with the second principal component, as if you had specified the MAXEIGEN= option.
The final number of clusters is controlled by three options: MAXCLUSTERS=, MAXEIGEN=, and PROPORTION=.
If you specify any of these three options, then only the options you specify affect the final number of clusters.
If you specify none of these options, VARCLUS continues to split clusters until the default splitting criterion is satisfied. The default splitting criterion depends on the CENTROID option:
- If you specify CENTROID, the default splitting criterion is PROPORTION=0.75.
- If you do not specify CENTROID, splitting is based on the MAXEIGEN= criterion, with a default depending on the COVARIANCE option:
  - For analyzing a correlation matrix (no COVARIANCE option), the default value for MAXEIGEN= is one.
  - For analyzing a covariance matrix (using the COVARIANCE option), the default value for MAXEIGEN= is the average variance of the variables being clustered.
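The defaults described above can be made visible by writing them out. The following sketch assumes Sashelp.Cars as sample input (an assumption for illustration only). Based on the rules stated here, each of the first two pairs of steps is expected to stop splitting under the same criterion.

```sas
/* Principal components on the correlation matrix.           */
/* With no criterion given, the default is MAXEIGEN=1.       */
proc varclus data=sashelp.cars;
   var _numeric_;
run;

/* The same criterion written explicitly. */
proc varclus data=sashelp.cars maxeigen=1;
   var _numeric_;
run;

/* Centroid components. With no criterion given, the default */
/* is PROPORTION=0.75.                                        */
proc varclus data=sashelp.cars centroid;
   var _numeric_;
run;

/* The same criterion written explicitly. */
proc varclus data=sashelp.cars centroid proportion=0.75;
   var _numeric_;
run;

/* Covariance matrix: the default MAXEIGEN= value is the average */
/* variance of the variables, so no explicit value is shown.     */
proc varclus data=sashelp.cars covariance;
   var _numeric_;
run;
```

Writing the criterion explicitly documents the intent in the program and guards against confusion when CENTROID or COVARIANCE is added or removed later.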
VARCLUS continues to split clusters until any of the following conditions holds:
- The number of clusters equals the value specified for MAXCLUSTERS=.
- No cluster qualifies for splitting according to the MAXEIGEN= or PROPORTION= criterion.
- A cluster was chosen for splitting, but after iteratively reassigning variables to clusters, one of the clusters has no members.
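As an illustration of how these stopping conditions interact, the following sketch combines two of the options. The input data set (Sashelp.Cars) and the option values 4 and 0.7 are arbitrary choices for illustration, not recommendations.

```sas
/* Splitting stops as soon as either condition holds:           */
/*   - four clusters have been formed (MAXCLUSTERS=4), or       */
/*   - no cluster has a second eigenvalue greater than 0.7      */
/*     (MAXEIGEN=0.7).                                          */
/* PROPORTION= is not specified, so it plays no role here.      */
proc varclus data=sashelp.cars maxclusters=4 maxeigen=0.7;
   var _numeric_;
run;
```

Because only the specified options affect the final number of clusters, omitting MAXEIGEN= in this step would leave MAXCLUSTERS=4 as the only criterion controlling when splitting stops.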
The following list gives details about the options.