The STEPDISC Procedure
PROC STEPDISC Statement
The PROC STEPDISC statement invokes the STEPDISC procedure. The options listed in Table 82.1 are available in the PROC STEPDISC statement.
Option | Description |
---|---|
Input Data Set | |
DATA= | specifies input SAS data set |
Method Details | |
MAXMACRO= | specifies maximum macro variable lists |
METHOD= | specifies method |
SINGULAR= | specifies singularity |
Control Stepwise Selection | |
SLENTRY= | specifies entry significance |
SLSTAY= | specifies staying significance |
PR2ENTRY= | specifies entry partial R square |
PR2STAY= | specifies staying partial R square |
INCLUDE= | forces inclusion of variables |
MAXSTEP= | specifies maximum number of steps |
START= | specifies variables to begin |
STOP= | specifies number of variables in final model |
Control Displayed Output | |
ALL | displays all |
BCORR | displays between correlations |
BCOV | displays between covariances |
BSSCP | displays between SSCPs |
PCORR | displays pooled correlations |
PCOV | displays pooled covariances |
PSSCP | displays pooled SSCPs |
SHORT | suppresses output |
SIMPLE | displays descriptive statistics |
STDMEAN | displays standardized class means |
TCORR | displays total correlations |
TCOV | displays total covariances |
TSSCP | displays total SSCPs |
WCORR | displays within correlations |
WCOV | displays within covariances |
WSSCP | displays within SSCPs |

BCOV displays between-class covariances. The between-class covariance matrix equals the between-class SSCP matrix divided by $n(c-1)/c$, where $n$ is the number of observations and $c$ is the number of classes. The between-class covariances should be interpreted in comparison with the total-sample and within-class covariances, not as formal estimates of population parameters.
DATA=SAS-data-set specifies the data set to be analyzed. The data set can be an ordinary SAS data set or one of several specially structured data sets created by statistical procedures available with SAS/STAT software. These specially structured data sets include TYPE=CORR, COV, CSSCP, and SSCP. If the DATA= option is omitted, the procedure uses the most recently created SAS data set.
INCLUDE=n includes the first n variables in the VAR statement in every model. By default, INCLUDE=0.
MAXMACRO=n specifies the maximum number of macro variables with independent variable lists to create. By default, MAXMACRO=100. PROC STEPDISC saves the list of selected variables in a macro variable, &_StdVar. Suppose your input variable list consists of x1-x10; then &_StdVar would be set to x1 x3 x4 x10 if, for example, the first, third, fourth, and tenth variables were selected for the model. This list can be used, for example, in a subsequent procedure’s VAR statement as follows:
var &_stdvar;
With BY processing, one macro variable is created for each BY group, and the macro variables are indexed by the BY-group number. The MAXMACRO= option can be used to either limit or increase the number of these macro variables in processing data sets with many BY groups. The macro variables are created as follows:
With no BY processing, PROC STEPDISC creates the following:

Macro Variable | Contents |
---|---|
_StdVar | selected variables |
_StdVar1 | selected variables |
_StdNumBys | number of BY groups (1) |
_StdNumMacroBys | number of _StdVar macro variables actually made (1) |

With BY processing, PROC STEPDISC creates the following:

Macro Variable | Contents |
---|---|
_StdVar | selected variables for BY group 1 |
_StdVar1 | selected variables for BY group 1 |
_StdVar2 | selected variables for BY group 2 |
... | |
_StdVarm | selected variables for BY group m, where a number is substituted for m |
_StdNumBys | n, the number of BY groups |
_StdNumMacroBys | the number of _StdVar macro variables actually made. This value might be less than _StdNumBys = n, and it is less than or equal to the MAXMACRO= value. |

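As a rough illustration of how these macro variables might be used without BY processing, the following sketch runs a stepwise selection and then passes the selected list to PROC DISCRIM. The Sashelp.Iris data set and its variable names are assumptions chosen for illustration; substitute your own data set, CLASS variable, and VAR list.

```sas
/* Sketch only: assumes the Sashelp.Iris sample data set is available. */
proc stepdisc data=sashelp.iris method=stepwise short;
   class Species;
   var SepalLength SepalWidth PetalLength PetalWidth;
run;

/* Write the macro variables created by PROC STEPDISC to the log. */
%put Selected variables: &_StdVar;
%put BY groups: &_StdNumBys;
%put Macro variable lists made: &_StdNumMacroBys;

/* Reuse the selected variables in a subsequent discriminant analysis. */
proc discrim data=sashelp.iris;
   class Species;
   var &_StdVar;
run;
```

Because no BY statement is used here, both _StdNumBys and _StdNumMacroBys should resolve to 1, and _StdVar and _StdVar1 hold the same list.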
MAXSTEP=n specifies the maximum number of steps. By default, the maximum is two times the number of variables in the VAR statement.
METHOD=BACKWARD | FORWARD | STEPWISE specifies the method used to select the variables in the model. The BACKWARD method specifies backward elimination, FORWARD specifies forward selection, and STEPWISE specifies stepwise selection. By default, METHOD=STEPWISE.
PCORR displays pooled within-class correlations (partial correlations based on the pooled within-class covariances).
PR2ENTRY=p specifies the partial R square for adding variables in the forward selection mode, where $p \le 1$.
PR2STAY=p specifies the partial R square for retaining variables in the backward elimination mode, where $p \le 1$.
SIMPLE displays simple descriptive statistics for the total sample and within each class.
SINGULAR=p specifies the singularity criterion for entering variables, where $0 < p < 1$. PROC STEPDISC precludes the entry of a variable if the squared multiple correlation of the variable with the variables already in the model exceeds $1 - p$. With more than one variable already in the model, PROC STEPDISC also excludes a variable if it would cause any of the variables already in the model to have a squared multiple correlation (with the entering variable and the other variables in the model) exceeding $1 - p$. By default, SINGULAR=1E-8.
SLENTRY=p specifies the significance level for adding variables in the forward selection mode, where $0 \le p \le 1$. The default value is 0.15.
SLSTAY=p specifies the significance level for retaining variables in the backward elimination mode, where $0 \le p \le 1$. The default value is 0.15.
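The entry and staying significance levels can be set together in the PROC STEPDISC statement. The following is a minimal sketch, assuming the Sashelp.Iris sample data set; the value 0.05 is an arbitrary choice that is stricter than the 0.15 default, not a recommendation.

```sas
/* Sketch only: stepwise selection with stricter thresholds than the default. */
proc stepdisc data=sashelp.iris
              method=stepwise
              slentry=0.05    /* significance level for adding a variable    */
              slstay=0.05;    /* significance level for retaining a variable */
   class Species;
   var SepalLength SepalWidth PetalLength PetalWidth;
run;
```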
START=n specifies that the first n variables in the VAR statement be used to begin the selection process. When you specify METHOD=FORWARD or METHOD=STEPWISE, the default value is 0; when you specify METHOD=BACKWARD, the default value is the number of variables in the VAR statement.
STDMEAN displays total-sample and pooled within-class standardized class means.
STOP=n specifies the number of variables in the final model. The STEPDISC procedure stops the selection process when a model with n variables is found. This option applies only when you specify METHOD=FORWARD or METHOD=BACKWARD. When you specify METHOD=FORWARD, the default value is the number of variables in the VAR statement; when you specify METHOD=BACKWARD, the default value is 0.
WSSCP displays the within-class corrected SSCP matrix for each class level.
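The options that depend on variable order (INCLUDE=, START=, and STOP=) refer to positions in the VAR statement, so the order in which variables are listed matters. The following sketch, again assuming the Sashelp.Iris sample data set, forces the first listed variable into every model and asks forward selection to stop once a two-variable model is found. The choice of PetalLength as the forced variable is purely illustrative.

```sas
/* Sketch only: the variable order in the VAR statement determines */
/* which variable INCLUDE=1 forces into every model.               */
proc stepdisc data=sashelp.iris
              method=forward
              include=1       /* always keep the first VAR variable  */
              stop=2;         /* stop when the model has 2 variables */
   class Species;
   var PetalLength PetalWidth SepalLength SepalWidth;
run;
```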
Copyright © 2009 by SAS Institute Inc., Cary, NC, USA. All rights reserved.