PRUNE C45 </ confidence> ;
PRUNE NONE ;
PRUNE by-metric </ until-metric operator value> ;
The PRUNE statement controls pruning. It has three different syntaxes: one for C4.5-style pruning, one for no pruning, and
one for pruning by using a specified metric.
The default pruning method is entropy. The following PRUNE statement example is equivalent to having no PRUNE statement:
prune entropy;
The preceding statement is also equivalent to the following statement:
prune entropy / entropy >= 1.0;
You can specify the following pruning options:
C45 </ confidence>
  requests C4.5 pruning (Quinlan, 1993), which uses the upper error rate from the binomial distribution (Wilson, 1927; Blyth and Still, 1983; Agresti and Coull, 1998) at the specified confidence limit. The default confidence is 0.25.
NONE
  turns off pruning.
by-metric </ until-metric operator value>
  chooses a node to prune back to a leaf according to the specified by-metric. Optionally, you can specify an until-metric, an operator, and a value to control when pruning stops. If you omit these arguments, until-metric is set to the same metric as by-metric, operator is set to >=, and value is set to 1. You can specify any of the following values for by-metric:
- ASE
  chooses the leaf that has the smallest change in the average square error (ASE).
- ENTROPY
  chooses the leaf that has the smallest change in the entropy.
- GINI
  chooses the leaf that has the smallest change in the Gini statistic.
- MISC
  chooses the leaf that has the smallest change in the misclassification rate.
Together, until-metric, operator, and value form the stopping condition. In the descriptions that follow, "is operator value" stands for the comparison that you specify. For example, with the defaults (>= and 1), pruning stops when the per-leaf change is greater than or equal to 1 times the per-leaf change of pruning the whole initial tree to a leaf. You can specify any of the following values for until-metric:
- ASE
  stops pruning when the per-leaf change in the average square error (ASE) is operator value times the per-leaf change in the ASE of pruning the whole initial tree to a leaf.
- ENTROPY
  stops pruning when the per-leaf change in the entropy is operator value times the per-leaf change in the entropy of pruning the whole initial tree to a leaf.
- GINI
  stops pruning when the per-leaf change in the Gini statistic is operator value times the per-leaf change in the Gini statistic of pruning the whole initial tree to a leaf.
- MISC
  stops pruning when the per-leaf change in the misclassification rate is operator value times the per-leaf change in the misclassification rate of pruning the whole initial tree to a leaf.
- N
  stops pruning when the number of leaves is operator value.
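As a rough illustration of how by-metric and until-metric combine, the following statements are sketches built only from the syntax described in this section. The metrics, operators, and numeric values are arbitrary choices for illustration, not recommended settings. The statements are shown as alternatives, and the surrounding procedure step is omitted.

```sas
/* Prune by misclassification rate; stop once the tree has 10 or fewer leaves. */
prune misc / n <= 10;

/* Prune by entropy, but decide when to stop by using the Gini statistic */
/* with a value of 0.5 (an arbitrary illustrative threshold).            */
prune entropy / gini >= 0.5;

/* Rely on the documented defaults; per the description of by-metric, */
/* this should be equivalent to "prune ase / ase >= 1;".              */
prune ase;
```

In the first statement the stopping rule is independent of the metric that selects which node to prune, which can be convenient when a specific tree size is wanted.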
You can specify any of the following values for operator:
- <= or LE
  less than or equal to
- >= or GE
  greater than or equal to
- < or LT
  less than
- > or GT
  greater than
- = or EQ
  equal to
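The symbolic and mnemonic operators are listed with identical meanings, so the first two statements in the following sketch should request the same stopping rule. The last two statements illustrate the other two syntaxes. The confidence value 0.1 is an arbitrary example, and its placement as a bare number after the slash is assumed from the syntax C45 </ confidence>. As before, the statements are alternatives and the surrounding procedure step is omitted.

```sas
/* These two statements should request the same stopping rule. */
prune gini / n <= 8;
prune gini / n le 8;

/* C4.5-style pruning with a confidence of 0.1 instead of the default 0.25. */
prune c45 / 0.1;

/* No pruning at all. */
prune none;
```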
Copyright © SAS Institute Inc. All Rights Reserved.