A decision tree uses
the values of one or more predictor data items to predict the values
of a target data item. A decision tree displays a series of nodes
as a tree, in which the top node represents the target data item and each
branch represents a split in the values of a predictor data item.
Decision trees are also known as classification and regression trees.
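
As a rough illustration of how splits on predictor values lead to a predicted target value, the sketch below walks a small hand-built tree. The dictionary layout, the key names (`predictor`, `children`, `prediction`), and the sample data are hypothetical and are not taken from SAS Visual Analytics.

```python
def predict(node, record):
    """Follow the branch that matches each predictor value until a leaf is reached."""
    while "children" in node:
        value = record[node["predictor"]]   # value of the predictor used for this split
        node = node["children"][value]      # descend along the matching branch
    return node["prediction"]


# Hypothetical tree: the target is "play", the predictors are "outlook" and "windy".
tree = {
    "predictor": "outlook",
    "children": {
        "sunny": {"prediction": "no"},
        "rainy": {
            "predictor": "windy",
            "children": {
                True: {"prediction": "no"},
                False: {"prediction": "yes"},
            },
        },
    },
}

result = predict(tree, {"outlook": "rainy", "windy": False})
```

A record whose predictor value has no matching branch raises a `KeyError` in this sketch; a real implementation would need a rule for unseen or missing values.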

Each branch of the tree displays the name of the predictor for the branch at the top of the
split. The thickness of the branch indicates the number of values
that are associated with each node. The predictor values for each
node are displayed above the node.

Each node in the tree displays the data for the node either as a histogram (if the target
contains continuous data) or as a bar chart (if the target contains
discrete data). The histogram or bar chart in each node displays the
values of the target data item that are selected by the splits in
the tree. The number at the top right of the node indicates the greatest
value or frequency for the bar chart or histogram. At the bottom of
each node, the total number of data values (count) for the node is
displayed.
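
The per-node summary described above might be computed along these lines. This is a minimal sketch: the function name `summarize_node`, the equal-width binning, and the default of four bins are assumptions made for illustration, and the binning that SAS Visual Analytics actually uses is not stated here.

```python
from collections import Counter


def summarize_node(target_values, continuous, bins=4):
    """Return (bars, peak, count) for the target values that reach one node.

    bars  -- frequency per bin (continuous target) or per category (discrete target)
    peak  -- greatest frequency, analogous to the number at the top right of a node
    count -- total number of data values, analogous to the count at the bottom
    """
    count = len(target_values)
    if continuous:
        lo, hi = min(target_values), max(target_values)
        width = (hi - lo) / bins or 1.0          # avoid a zero width when all values are equal
        freq = [0] * bins
        for v in target_values:
            idx = min(int((v - lo) / width), bins - 1)   # put the maximum in the last bin
            freq[idx] += 1
        bars = {f"bin{i}": f for i, f in enumerate(freq)}
    else:
        bars = dict(Counter(target_values))
    peak = max(bars.values())
    return bars, peak, count
```

The sketch assumes that the node contains at least one value; an empty list would raise an error in `min` or `max`.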

Decision trees in SAS Visual Analytics use a modified version of the C4.5 algorithm.
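
For background, the standard C4.5 algorithm chooses splits for a discrete predictor by using the gain ratio: the information gain of the split divided by the entropy of the split itself. The sketch below shows that textbook criterion only. The modifications that SAS Visual Analytics makes are not described here, so this should not be read as the product's implementation, and the function names are illustrative.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of discrete labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def gain_ratio(predictor, target):
    """Textbook C4.5 gain ratio for splitting `target` on a discrete `predictor`."""
    total = len(target)
    groups = {}
    for p, t in zip(predictor, target):
        groups.setdefault(p, []).append(t)      # target values that fall in each branch
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(target) - remainder     # reduction in target entropy
    split_info = entropy(predictor)             # entropy of the branch sizes
    return info_gain / split_info if split_info > 0 else 0.0
```

A predictor that takes only one value produces no split, so the sketch returns 0.0 for it instead of dividing by zero.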

The details table for a decision tree contains two additional data columns, Node ID and
Parent ID. Node ID specifies a unique value for each node in the tree.
Parent ID specifies the ID of the parent node.
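
Because each row carries both a Node ID and a Parent ID, the tree structure can be rebuilt from an export of the details table. The sketch below assumes that the rows are available as dictionaries keyed by column name and that the root node has a Parent ID of `None`; how the product actually represents the parent of the root node is not stated here, and the helper names are hypothetical.

```python
def parent_map(rows):
    """Map each Node ID to its Parent ID; repeated rows for a node collapse to one entry."""
    return {row["Node ID"]: row["Parent ID"] for row in rows}


def children_map(parent_of):
    """Invert the parent map into {parent ID: [child IDs]}."""
    children = {}
    for node, parent in parent_of.items():
        if parent is not None:                  # assumed marker for the root node
            children.setdefault(parent, []).append(node)
    return children


def depth_of(node_id, parent_of):
    """Count the number of branches between a node and the root."""
    depth = 0
    while parent_of.get(node_id) is not None:
        node_id = parent_of[node_id]
        depth += 1
    return depth


# Hypothetical rows from a details table (other data columns omitted).
rows = [
    {"Node ID": 0, "Parent ID": None},
    {"Node ID": 1, "Parent ID": 0},
    {"Node ID": 2, "Parent ID": 0},
    {"Node ID": 3, "Parent ID": 1},
]
parent_of = parent_map(rows)
children = children_map(parent_of)
```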