Sample 50096: Coloring the clusters in a dendrogram

Generate colored clusters in a dendrogram

Contents: Purpose / History / Requirements / Usage / Limitations
PURPOSE:
The %CLUSTERGROUPS macro enhances dendrograms produced in SAS by adding color to highlight the clusters. You specify the number of clusters desired as input to the macro.
HISTORY:
1.0  Initial version.
REQUIREMENTS:
Base SAS software, Version 9.3 or later. In addition, SAS/STAT software (PROC CLUSTER) is generally needed to produce the macro's input data set.
USAGE:
Follow the instructions in the Downloads tab of this sample to save the %CLUSTERGROUPS macro definition. Replace the text within quotation marks in the following statement with the location of the %CLUSTERGROUPS macro definition file on your system. In your SAS program or in the SAS editor window, specify this statement to define the %CLUSTERGROUPS macro and make it available for use:

%inc "<location of your file containing the CLUSTERGROUPS macro>";

After submitting this statement, you can call the %CLUSTERGROUPS macro. Before calling the macro, create the data set that you will specify in DATA=. The easiest way to do this is to run a cluster analysis in PROC CLUSTER and save the result with the OUTTREE= option. The example in the Results tab uses this technique.
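As a minimal sketch of this step, the following code clusters the SASHELP.CLASS sample data and saves the tree in an OUTTREE= data set. The choice of data, the clustering method, the analysis variables, and the output name WORK.TREE are illustrative assumptions, not part of this sample:

```sas
/* Illustrative only: cluster the students in SASHELP.CLASS on        */
/* standardized height and weight. WORK.TREE is a placeholder name.   */
proc cluster data=sashelp.class method=ward std
             outtree=work.tree noprint;
   var height weight;   /* numeric variables used to form the clusters */
   id name;             /* ID variable that labels each observation    */
run;
```

The resulting OUTTREE= data set contains the ID variable along with _NAME_, _PARENT_, and _HEIGHT_, which are the variables that the macro requires.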

The following macro parameter is required:

NCLUSTERS=
Number of clusters to highlight with color in the dendrogram.

The following macro parameters are optional:

DATA=
SAS data set that can be used by the TREE procedure, typically created with the OUTTREE= option in PROC CLUSTER. By default, the macro uses the most recently created SAS data set. The required variables in the DATA= data set are as follows:
  • an ID variable
  • _NAME_, a character variable giving the name of the node
  • _PARENT_, a character variable giving the value of _NAME_ of the parent of the node
  • _HEIGHT_, the distance or similarity between the last clusters joined
ID=
ID variable from PROC CLUSTER. If you do not specify one, the macro chooses an ID variable.
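A call that uses these parameters might look like the following sketch. It assumes that NCLUSTERS is passed as a keyword parameter in the same way as DATA= and ID=, that a data set named WORK.TREE was created with the OUTTREE= option in PROC CLUSTER, and that NAME is its ID variable. The data set name, the ID variable, and the value 3 are placeholders:

```sas
/* Assumed call syntax; WORK.TREE, NAME, and 3 are placeholder values. */
%CLUSTERGROUPS(data=work.tree, id=name, nclusters=3)
```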

The version of the %CLUSTERGROUPS macro that you are using is displayed in the SAS log when you specify version (or any string) as the first argument. For example:

    %CLUSTERGROUPS(version, ...other options...)

The %CLUSTERGROUPS macro attempts to check for a later version of itself. If it cannot do so (for example, because no internet connection is available), the macro issues the following message:

   CLUSTERGROUPS: Unable to check for newer version

The computations performed by the macro are not affected by the appearance of this message.

LIMITATIONS:
Very little error checking is done. The macro assumes the input data set has the required variables.
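Because the macro does little checking of its own, you might want to verify the input data set before calling it. The following is one possible sketch. The helper macro %CHECK_TREE_VARS is hypothetical and is not part of %CLUSTERGROUPS, and WORK.TREE and NAME are placeholder names:

```sas
/* Hypothetical helper, not part of %CLUSTERGROUPS: reports any of   */
/* the required variables that are absent from the input data set.   */
%macro check_tree_vars(data=, id=);
   %local dsid rc i var missing required;
   %let required = _NAME_ _PARENT_ _HEIGHT_ &id;

   %let dsid = %sysfunc(open(&data));
   %if &dsid = 0 %then %do;
      %put ERROR: Unable to open data set &data..;
      %return;
   %end;

   %do i = 1 %to %sysfunc(countw(&required, %str( )));
      %let var = %scan(&required, &i, %str( ));
      /* VARNUM returns 0 when the variable is not in the data set. */
      %if %sysfunc(varnum(&dsid, &var)) = 0 %then
         %let missing = &missing &var;
   %end;
   %let rc = %sysfunc(close(&dsid));

   %if %length(&missing) > 0 %then
      %put ERROR: &data is missing required variable(s): &missing..;
   %else
      %put NOTE: &data contains all required variables.;
%mend check_tree_vars;

/* Placeholder names: WORK.TREE and NAME. */
%check_tree_vars(data=work.tree, id=name)
```

The check looks only for the existence of the variables. It does not confirm that their types or values are suitable for the macro.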

BY-group processing is not supported. The macro cannot analyze an input data set that contains multiple BY groups.
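One possible workaround, shown here only as a sketch, is to run the cluster analysis separately for each group so that each OUTTREE= data set holds a single tree. The example assumes the SASHELP.CLASS sample data with SEX as the grouping variable and assumes the keyword call syntax NCLUSTERS=. The data set name WORK.TREE_F and the parameter values are placeholders:

```sas
/* Illustrative only: analyze one group at a time instead of using BY. */
proc cluster data=sashelp.class(where=(sex='F')) method=ward std
             outtree=work.tree_f noprint;
   var height weight;
   id name;
run;

/* Assumed call syntax; repeat both steps for each remaining group. */
%CLUSTERGROUPS(data=work.tree_f, id=name, nclusters=2)
```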




These sample files and code examples are provided by SAS Institute Inc. "as is" without warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability and fitness for a particular purpose. Recipients acknowledge and agree that SAS Institute shall not be liable for any damages whatsoever arising out of their use of this material. In addition, SAS Institute will provide no support for the materials contained herein.