Working with Nodes That Model
About the Tree Desktop Application
SAS Enterprise Miner Tree Desktop Application is a Microsoft Windows application that implements decision tree methodology for data mining. The application functions in either viewer mode or SAS mode. Viewer mode enables you to interactively browse decision trees that are created with the Enterprise Miner Tree node. SAS mode enables you not only to browse the results, but also to use the software as a complete application by providing automatic and interactive training modes. After you train the tree interactively with the application, you can also view the Java tree results.
Invoke the Application
In this task, you will use the Tree Desktop Application to assess the decision tree model.
Drag a second Decision Tree node from the Model tab on the toolbar into the Diagram Workspace and connect it to the Replacement node.
Note: To organize the diagram layout, right-click the background of the Diagram Workspace and select Layout Horizontally. Continue to use this feature to organize the layout as the diagram becomes more complex.
Select the second Decision Tree node and set the following Tree node properties in the Properties panel:
Set the Number of Rules to 10. This property controls how many candidate splitting rules are shown during interactive training.
Set the Number of Surrogate Rules to 4.
Click the ellipsis button to the right of the Interactive property to invoke the Tree Desktop Application.
Note: If you are asked to update the path before the application is invoked, click OK.
Right-click the Decision Tree node and select Update in order to ensure that all predecessor nodes have been run.
Click OK from the Status window when the path is updated. By default, the root node of the tree diagram is displayed.
Assign Prior Probabilities
Examine the messages in the lower right-hand corner of the window. Note the message Priors Not Applied. The message indicates that the probabilities that were defined earlier have not been applied to this view of the data.
Select Edit → Apply Prior Probabilities from the menu in order to apply the prior probabilities that you defined earlier.
Note that the node counts are now adjusted for the priors. The message panel at the bottom right verifies that the prior probabilities have been applied.
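The tutorial does not spell out the arithmetic behind this adjustment, but the following is a standard prior-adjustment formula, offered as background rather than as the application's documented computation. If \(n_i\) is the count of target class \(i\) in a node, \(\pi_i\) is the prior probability of class \(i\), and \(\rho_i\) is the proportion of class \(i\) in the training sample, then the prior-adjusted proportion for class \(i\) is

\[
\hat{p}_i = \frac{\pi_i \, n_i / \rho_i}{\sum_j \pi_j \, n_j / \rho_j}.
\]

For example, under an assumed prior of 0.05 for donors in a training sample where donors make up 0.5 of the cases, each donor count is weighted by 0.05/0.5 = 0.1 before the proportions are renormalized.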
Create the First Interactive Split
Right-click the root node of the tree diagram and select Split Node.
The Candidate Splitting Rules window opens, displaying the top ten variables, which are sorted by logworth. In this case, logworth is the negative log of the p-value for the chi-square test; good predictors have higher logworth values. (A small computational sketch follows this step.) Select the variable FREQUENCY_STATUS_97NK in the Split Node window.
Click Apply to define the first split. The Tree Diagram shows the first split.
Add Additional Node Statistics
Right-click the background of the Tree Diagram, and then select Node Statistics.
In the General tab of the Node Statistics window, select the Decision and Predicted Profit boxes and click OK.
The Decision Tree displays the additional statistics. Note: By default, the width of the line from a parent node to a child node depends on the ratio of the number of cases in the child node to the number of cases in the parent node. This distinction is useful when you examine a much larger tree and hide the node statistics.
Shade the Nodes by Profit
By default, both the Tree Map and the Tree nodes are shaded according to the proportion of the target event in each node; lighter-shaded nodes indicate a greater frequency of non-donors. You can instead shade the nodes according to the expected profit value.
Right-click the background of the Tree and select Tree View Properties.
In the Tree View Properties window, select Profit.
Set Range Maximum to 1.
Set Range Minimum to 0.
Click OK. In the Tree window, note that nodes that have higher expected Profit values are shaded darker than nodes that have lower expected Profit values.
Define the Second Split
Right-click the node that has 7,630 cases and select Split Node.
The Split Node window opens. Candidate rules are always shown for the node that you choose. The Number of Rules property that you set before invoking the application controls how many rules are displayed.
Select PEP_STAR as the splitting variable and then click Apply. Click OK. The Tree Diagram displays the second split.
Create a Multi-Way Split
To open the candidate splitting rules for the dark-shaded node (the node that has 3,024 observations), right-click the node and select Split Node.
In the Split Node window, select the variable Replacement: MONTHS_SINCE_LAST_GIFT and then click Edit Rule. In the MONTHS_SINCE_LAST_GIFT Splitting Rule window, enter 8 in the New split point box and then click Add Branch.
Select Branch 2 (<8.5) and then click Remove Branch. In the New split point box, enter 14 as the third split point and then click Add Branch.
Click OK in order to create the modified three-way split. The node that has 3,024 observations is split three ways.
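To make the resulting rule concrete, here is a small hypothetical sketch of the three-way assignment. It assumes that the split points 8 and 14 are placed at 8.5 and 14.5 between adjacent integer values, as the Branch 2 label (<8.5) above suggests; the function name is invented for illustration.

```python
def branch_for(months_since_last_gift: float) -> int:
    """Assign a case to a branch of the modified three-way split.

    Hypothetical helper: cut points of 8.5 and 14.5 are inferred from
    the branch labels shown in the Splitting Rule window.
    """
    if months_since_last_gift < 8.5:
        return 1   # first branch: fewer than 8.5 months since last gift
    elif months_since_last_gift < 14.5:
        return 2   # middle branch: between 8.5 and 14.5 months
    else:
        return 3   # third branch: 14.5 months or more

print(branch_for(6), branch_for(12), branch_for(20))  # prints: 1 2 3
```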
Select View → Tree Map from the main menu. The node that has only 46 observations is shaded red in the Tree Map. Recall that the width of the node is proportional to the number of training cases in the node. Nodes that contain few observations cannot be drawn accurately in the space that is allocated to the view. Depending on how your windows are arranged, you might or might not see a red node. Try reducing the size of the Tree Map window in order to see a node that has fewer observations.
Prune a Node from the Tree
Use interactive mode to prune a tree.
To define a split for any current terminal node, right-click the node. Select Split Node, then select a splitting rule, and click Apply. This example uses the node that has 2,989 observations.
Right-click the parent node for the split that you just defined and select Prune.
Note: You can also define splits and prune nodes from the Tree Map view.
Train the Tree in Automatic Mode
At this point you can continue to split and prune nodes interactively until you obtain a satisfactory tree, or you can use the algorithm to train the rest of the tree in automatic mode.
To train the tree in automatic mode, right-click a terminal node that has several observations, then select Train. Repeat the process as desired for other nodes.
Other Tree Control Features
Here are some additional Tree features that you might want to explore:
Use the zoom in/out feature by right-clicking the background of the Tree and selecting Zoom. You might also want to change the node statistics.
Follow a similar menu path to change the font.
To print the tree on one page or across multiple pages, select File → Print.
View the Tree Results
At this point you are still working in interactive mode.
From the main menu, select Train → View Results to change to the results viewer mode. In this task you incorporate validation data into the evaluation of the tree.
From the main menu, select View → Leaf Statistics Bar Chart and View → Leaf Statistics Table. The Leaf Statistics Bar Chart and Leaf Statistics Table windows open.
Right-click inside the Leaf Statistics Bar Chart and select Bar Chart Properties.
Examine the various Bar Chart settings.
Select a node, a subtree, or a variable in one window. Note that the other windows are automatically updated to select the corresponding items.
From the main menu, select View → Assessment Plot to display the Assessment plot.
Note: You can choose a smaller tree interactively by selecting a smaller number of leaves in this plot. This feature enables you to find a tree that performs well on both the training and validation data.
View the Tree in the Java Tree Results Viewer of Enterprise Miner
Close the application and save the changes to the model. A message is displayed when you close the model in the Tree Desktop Application.
Run the Decision Tree node path from the Diagram Workspace and follow the messages that direct you to view the results.
Close the Tree Results window.
Rename the second Decision Tree node in your diagram to indicate that it was the interactive training tree. Right-click the second Decision Tree node, select Rename, and rename the node Interactive Decision Tree.
Click OK.