The PROC MODECLUS statement invokes the MODECLUS procedure. Table 66.1 summarizes the options available in the PROC MODECLUS statement. These options are discussed in the following sections.
Table 66.1: Summary of PROC MODECLUS Statement Options
Option          Description

Specify input and output data sets
DATA=           Specifies input data set name
OUT=            Specifies output data set name for observations
OUTCLUS=        Specifies output data set name for clusters
OUTSUM=         Specifies output data set name for cluster solutions

Specify variables in output data sets
CLUSTER=        Specifies variable in the OUT= and OUTCLUS= data sets identifying clusters
DENSITY=        Specifies variable in the OUT= data set containing density estimates
OUTLENGTH=      Specifies length of variables in the output data sets

Summarize and process coordinate data before clustering
SIMPLE          Requests simple statistics
STANDARD        Standardizes the variables to mean 0 and standard deviation 1

Specify smoothing parameters
DK=             Specifies number of neighbors to use for kth-nearest-neighbor density estimation
CK=             Specifies number of neighbors to use for clustering
K=              Specifies number of neighbors to use for kth-nearest-neighbor density estimation and clustering
DR=             Specifies radius of the sphere of support for uniform-kernel density estimation
CR=             Specifies radius of the neighborhood for clustering
R=              Specifies radius of the sphere of support for uniform-kernel density estimation and of the neighborhood for clustering

Specify density estimation options
CASCADE=        Specifies number of times the density estimates are to be cascaded
DIMENSION=      Specifies dimensionality to be used when computing density estimates
AM              Uses arithmetic means for cascading density estimates
HM              Uses harmonic means for cascading density estimates
SUM             Uses sums for cascading density estimates

Specify clustering methods and options
DOCK=           Dissolves clusters with n or fewer members
EARLY           Stops the analysis after obtaining a solution with either no cluster or a single cluster
JOIN=           Requests that nonsignificant clusters be hierarchically joined
MAXCLUSTERS=    Specifies maximum number of clusters to be obtained with METHOD=6
METHOD=         Specifies clustering method to use
MODE=           Specifies minimum members for either cluster to be designated a modal cluster when two clusters are joined using METHOD=5
POWER=          Specifies power of the density used with METHOD=6
TEST            Specifies approximate significance tests for the number of clusters
THRESHOLD=      Specifies assignment threshold used with METHOD=6

Specify the output display options
ALL             Produces all optional output
BOUNDARY        Displays the density and cluster membership of observations with neighbors belonging to a different cluster
CORE            Retains the neighbor lists for each observation in memory
CROSSLIST       Displays the estimated cross-validated log density of each observation
LIST            Displays the estimated density and cluster membership of each observation
LOCAL           Displays estimates of local dimensionality and writes them to the OUT= data set
NEIGHBOR        Displays the neighbors of each observation
NOPRINT         Suppresses the display of the output
NOSUMMARY       Suppresses the display of the summary of the number of clusters, number of unassigned observations, and maximum p-value for each analysis
SHORT           Suppresses the display of statistics for each cluster
TRACE           Traces the cluster assignments when METHOD=6

You can specify at least one of the following options for smoothing parameters for density estimation: DK=, K=, DR=, or R=. To obtain a cluster analysis, you can specify the METHOD= option and at least one of the following smoothing parameters for clustering: CK=, K=, CR=, or R=. If you want significance tests for the number of clusters, you should specify either the DR= or R= option. If none of the smoothing parameters is specified, the MODECLUS procedure provides a default value for the R= option. See the section Density Estimation for the formula of a reasonable first guess for R= and a discussion of smoothing parameters.
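As a minimal sketch of the combinations described above, the following step requests a cluster analysis with significance tests by supplying METHOD= together with R=, which sets the radius for both density estimation and clustering. The data set names (work.points, work.clustered) and the variables x and y are hypothetical, and the values METHOD=1 and R=2 are arbitrary illustrations rather than recommendations.

```sas
/* Hypothetical input: work.points with numeric coordinates x and y. */
/* METHOD= plus R= is enough for a cluster analysis; R= also         */
/* satisfies the DR=-or-R= condition for the TEST option.            */
proc modeclus data=work.points method=1 r=2 test out=work.clustered;
   var x y;
run;
```

Replacing R=2 with DR= and CR= values would let the density-estimation radius differ from the clustering radius.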
You can specify lists of values for the DK=, CK=, K=, DR=, CR=, and R= options. Numbers in the lists can be separated by blanks or commas. You can include in the lists one or more items of the form start TO stop BY increment. Each list can contain either one value or the same number of values as in every other list that contains more than one value. If a list has only one value, that value is used in combination with all the values in longer lists. If two or more lists have more than one value, then one analysis is done by using the first value in each list, another analysis is done by using the second value in each list, and so on.
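The pairing rule above can be illustrated with a short sketch. The data set work.points and the variables x and y are again hypothetical, and the numeric values are chosen only to show how the lists line up.

```sas
/* DR= expands to 1, 2, 3 and CR= has a single value, so three       */
/* analyses are run with (DR, CR) = (1, 2), (2, 2), (3, 2).          */
proc modeclus data=work.points method=1 dr=1 to 3 by 1 cr=2 short;
   var x y;
run;

/* Both lists have three values, so they are matched by position:    */
/* (DK, CK) = (10, 5), (20, 10), (30, 15).                           */
proc modeclus data=work.points method=1 dk=10 20 30 ck=5, 10, 15 short;
   var x y;
run;
```

A step such as DK=10 20 30 CK=5 10 would not fit the rule, because the two multi-value lists differ in length.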
You can specify the following options in the PROC MODECLUS statement.