Using the VARCLUS Procedure

Default options for PROC VARCLUS often provide satisfactory results. If you want to change the final number of clusters, use one or more of the MAXCLUSTERS=, MAXEIGEN=, or PROPORTION= options. The MAXEIGEN= and PROPORTION= options usually produce similar results but occasionally cause different clusters to be selected for splitting. The MAXEIGEN= option tends to choose clusters with a large number of variables, while the PROPORTION= option is more likely to select a cluster with a small number of variables.
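
As a minimal sketch of these options, the following steps run the procedure once with each stopping criterion. The data set work.survey, the variables x1-x40, and the threshold values are hypothetical and chosen only for illustration.

```sas
/* Hypothetical data set and variables; all thresholds are illustrative. */

/* Split clusters while any second eigenvalue exceeds 0.7. */
proc varclus data=work.survey maxeigen=0.7;
   var x1-x40;
run;

/* Split clusters until each cluster component explains
   at least 75% of the variation in its cluster. */
proc varclus data=work.survey proportion=0.75;
   var x1-x40;
run;

/* Allow at most six clusters in the final solution. */
proc varclus data=work.survey maxclusters=6;
   var x1-x40;
run;
```

Comparing the cluster summaries from the first two runs shows whether the two criteria selected different clusters for splitting on your data.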

Execution Time

PROC VARCLUS usually requires more computer time than principal factor analysis, but it can be faster than some of the iterative factoring methods. If you have more than 30 variables, you might want to reduce execution time by using one or more of the following methods:

  • Specify the MINCLUSTERS= and MAXCLUSTERS= options if you know how many clusters you want.

  • Specify the HIERARCHY option.

  • Specify the SEED statement if you have some prior knowledge of what clusters to expect.
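
The three methods above might look like the following sketch. The data set work.survey, the variables x1-x40, the cluster count of 4, and the choice of seed variables are all hypothetical; substitute values that reflect what you know about your own data.

```sas
/* Hypothetical data set, variables, and cluster counts throughout. */

/* Known number of clusters: fix both bounds at 4. */
proc varclus data=work.survey minclusters=4 maxclusters=4;
   var x1-x40;
run;

/* Require the clusters at successive levels to stay nested. */
proc varclus data=work.survey hierarchy;
   var x1-x40;
run;

/* Prior knowledge of the clusters: start each cluster from one
   variable that is expected to belong to it. */
proc varclus data=work.survey;
   var x1-x40;
   seed x1 x11 x21 x31;
run;
```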

If computer time is not a limiting factor, you might want to try one of the following methods to obtain a better solution:

  • If the clustering algorithm has not converged, specify larger values for MAXITER= and MAXSEARCH=.

  • Try several factoring and rotation methods with PROC FACTOR, and use each resulting solution as the initial cluster structure for PROC VARCLUS.

  • Run PROC VARCLUS several times, specifying INITIAL=RANDOM.
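
The following sketch illustrates the three methods above. It assumes a hypothetical data set work.survey with variables x1-x40 and a four-cluster solution. The iteration limits, the factoring and rotation method, and the RANDOM= starting values are arbitrary illustrations. The second example assumes that the OUTSTAT= data set from PROC FACTOR is read by PROC VARCLUS through INITIAL=INPUT; check that the factor pattern is picked up as expected before relying on it.

```sas
/* Hypothetical data set, variables, and option values throughout. */

/* Allow more iterations and a longer search if a run has not converged. */
proc varclus data=work.survey maxclusters=4 maxiter=50 maxsearch=20;
   var x1-x40;
run;

/* Use a rotated factor solution as the initial cluster structure.
   Repeat with other METHOD= and ROTATE= values to try other starts. */
proc factor data=work.survey method=principal nfactors=4
            rotate=varimax outstat=work.fact;
   var x1-x40;
run;

proc varclus data=work.fact initial=input;
   var x1-x40;
run;

/* Several random starts; RANDOM= makes each run reproducible. */
proc varclus data=work.survey initial=random random=101
             minclusters=4 maxclusters=4;
   var x1-x40;
run;

proc varclus data=work.survey initial=random random=202
             minclusters=4 maxclusters=4;
   var x1-x40;
run;
```

Among the runs, the solution that explains the largest proportion of variation is a reasonable candidate to keep.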