The Pruning property of the decision tree visualization determines how aggressively your decision tree is pruned. The growth algorithm creates a decision tree based on the properties that you specify. The pruning algorithm considers each node to be the root node of its own subtree, starting from the bottom. If the misclassification rate of the subtree is significantly better than the misclassification rate of the root node, then the subtree is kept. If the misclassification rate of the subtree is similar to the misclassification rate of the root node, then the subtree is pruned. In general, smaller decision trees are preferred.
If the Pruning property slider is closer to Lenient, then the difference in the misclassification rates that is required to keep a subtree is relatively small. If the slider is closer to Aggressive, then the required difference is relatively large, so more subtrees are pruned. That is, a lenient pruning algorithm allows the decision tree to grow much deeper than an aggressive pruning algorithm.
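The bottom-up check described above can be sketched in Python. This is an illustrative sketch only, not the product's actual algorithm: `Node`, `subtree_error`, `prune`, and the numeric `threshold` (standing in for the Lenient/Aggressive slider) are hypothetical names, and averaging child subtrees equally is a simplification of a true, observation-weighted misclassification rate.

```python
class Node:
    """A decision tree node with a hypothetical per-node error rate."""
    def __init__(self, error, children=None):
        self.error = error            # misclassification rate if this node were a leaf
        self.children = children or []

def subtree_error(node):
    # Misclassification rate of the subtree's leaves.
    # Simplification: children are weighted equally rather than by
    # the number of observations that reach them.
    if not node.children:
        return node.error
    return sum(subtree_error(c) for c in node.children) / len(node.children)

def prune(node, threshold):
    """Bottom-up pruning: keep a subtree only if it improves on its own
    root node's misclassification rate by more than `threshold`."""
    for child in node.children:
        prune(child, threshold)       # visit deepest subtrees first
    if node.children:
        improvement = node.error - subtree_error(node)
        if improvement <= threshold:
            node.children = []        # collapse the subtree to a leaf
    return node
```

With a small threshold (toward Lenient), a modest improvement is enough to keep a subtree; with a large threshold (toward Aggressive), the same subtree is collapsed into a leaf.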
Variables that are not used in any split can still affect the decision tree, typically for one of two reasons. It is possible that a variable was used in a split, but the subtree that contained that split was pruned. Alternatively, the variable might contain missing values while the Include missing property is disabled.
Note: If a predictor does not contribute to the predictive accuracy of the decision tree, or its contribution is too small, then it is not included in the final, displayed decision tree.