PROC GENESELECT: SAVE Statement

The GENESELECT Procedure

SAVE <options> ;

The SAVE statement writes model information to SAS data sets.

DISSIMILARITY=SAS-data-set

names the output data set to contain a dissimilarity statistic for each pair of input variables. The data set has the type DISTANCE and is suitable as input to the DATA= option of the CLUSTER procedure. The data set includes an ID variable, _VAR_. Each dissimilarity value equals one minus the corresponding similarity value that the SIMILARITY= option outputs. Similarity relies on surrogate rules, so specify the MAXSURROGATES= option in the PROC statement to create surrogate rules when the model is fit.

IMPORTANCE=SAS-data-set

names the output data set to contain the split-based variable importance.

MODEL=SAS-data-set

names the output data set to encode the information necessary for use with the INMODEL= option in a subsequent invocation of the GENESELECT procedure.

SIMILARITY=SAS-data-set

names the output data set to contain a similarity statistic for pairs of input variables. The data set contains a variable for every input variable used in a primary splitting rule, and an additional identification variable, _VAR_, whose value is the name of an input variable. Similarity relies on surrogate rules. Use the MAXSURROGATES= option in the PROC statement to create surrogate rules when the model is fit. The similarity matrix equals one minus the dissimilarity matrix that is created by using the DISSIMILARITY= option. The DISSIMILARITY= option creates a DISTANCE matrix suitable for input to the CLUSTER procedure.
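
The relationship between the two matrices, and the hand-off to the CLUSTER procedure, can be sketched as follows. This is an illustration under stated assumptions, not output from the GENESELECT procedure: the data set names (work.sim, work.dissim, work.tree), the input names x1-x3, and all similarity values are hypothetical, and the hand-built work.sim only imitates the layout described above (_VAR_ plus one numeric variable per input). The choice of METHOD=AVERAGE is likewise arbitrary.

```sas
/* Hypothetical similarity values for three inputs (x1, x2, x3).         */
/* They stand in for a data set written by SAVE SIMILARITY=; the layout  */
/* (_VAR_ plus one numeric column per input) follows the description.    */
data work.sim;
   input _VAR_ $ x1 x2 x3;
   datalines;
x1 1.0 0.8 0.1
x2 0.8 1.0 0.2
x3 0.1 0.2 1.0
;
run;

/* Dissimilarity = 1 - similarity, applied to every numeric column.      */
/* TYPE=DISTANCE marks the result as a distance matrix for PROC CLUSTER. */
data work.dissim(type=distance);
   set work.sim;
   array s{*} _numeric_;   /* x1-x3 only; the index i is created later   */
   do i = 1 to dim(s);
      s{i} = 1 - s{i};
   end;
   drop i;
run;

/* Cluster the input variables, labeling each row by its _VAR_ value.    */
proc cluster data=work.dissim method=average outtree=work.tree;
   id _VAR_;
run;
```

In practice the second DATA step is unnecessary, because the DISSIMILARITY= option already writes a data set of type DISTANCE; it appears here only to make the one-minus relationship explicit.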

Note: This procedure is experimental.
