
Working with Nodes That Model

Create an Interactive Decision Tree


About the Tree Desktop Application

SAS Enterprise Miner Tree Desktop Application is a Microsoft Windows application that implements decision tree methodology for data mining. The application functions in either viewer mode or SAS mode. Viewer mode enables you to interactively browse decision trees that are created with the Enterprise Miner Tree node. SAS mode enables you not only to browse the results, but also to train trees in automatic and interactive modes, using the software as a complete application. After you train the tree interactively with the application, you can also view the Java tree results.


Invoke the Application

In this task, you will use the Tree Desktop Application to assess the decision tree model.

  1. Drag a second Decision Tree node from the Model tab on the toolbar into the Diagram Workspace and connect it to the Replacement node.

    [untitled graphic]

    Note:   To organize the diagram layout, right-click the background of the Diagram Workspace and select Layout  [arrow]  Horizontally as shown below. Continue to use this feature to organize the layout as the diagram becomes more complex.

    [untitled graphic]

  2. Select the second Decision Tree node and set the following Tree node properties in the Properties panel:

    • Set the Number of Rules to 10. This property controls how many candidate splitting rules are shown during interactive training.

    • Set the Number of Surrogate Rules to 4.

    [untitled graphic]

  3. Click the ellipsis button to the right of the Interactive property to invoke the Tree Desktop Application.

    [untitled graphic]

    Note:   If you are asked to update the path before the application is invoked, click OK.

  4. Right-click the Decision Tree node and select Update in order to ensure that all predecessor nodes have been run.

    [untitled graphic]

  5. Click OK in the Status window when the path is updated.

    By default, the root node of the tree diagram is displayed.

    [untitled graphic]


Assign Prior Probabilities

  1. Examine the messages in the lower right-hand corner of the window. Note the message Priors Not Applied, which indicates that the prior probabilities that you defined earlier have not been applied to this view of the data.

  2. Select Edit  [arrow]  Apply Prior Probabilities from the menu in order to apply the prior probabilities that you defined earlier.

    [untitled graphic]

    Note that the node counts are now adjusted for the priors. The message panel at the bottom right verifies that the prior probabilities have been applied.

    [untitled graphic]
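The prior adjustment applied above can be sketched in a few lines. The following Python snippet is an illustrative sketch, not SAS code: the function name, counts, and priors are hypothetical, and it assumes the standard re-weighting of node counts by each class's prior probability relative to its share of the training sample.

```python
def prior_adjusted_proportions(node_counts, sample_counts, priors):
    """Re-weight raw node counts so class proportions reflect prior
    probabilities rather than the (possibly oversampled) training data.

    node_counts   -- raw count of each class in this node
    sample_counts -- total count of each class in the training sample
    priors        -- prior probability of each class in the population
    """
    # Weight each class by prior / sample proportion, then normalize.
    weighted = [n * p / s for n, p, s in zip(node_counts, priors, sample_counts)]
    total = sum(weighted)
    return [w / total for w in weighted]

# Hypothetical node: 50 donors and 50 non-donors, drawn from a balanced
# training sample of 500 each, with an assumed 5% donor rate overall.
props = prior_adjusted_proportions([50, 50], [500, 500], [0.05, 0.95])
```

With equal raw counts in both classes, the adjusted proportions simply reproduce the priors, which is why applying priors can change the displayed node statistics substantially when the training data are oversampled.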


Create the First Interactive Split

  1. Right-click the root node of the tree diagram and select Split Node.

    [untitled graphic]

    The Candidate Splitting Rules window opens, displaying the top ten variables, sorted by logworth. In this case, logworth is the negative base-10 logarithm of the p-value for the Chi-Square test of the split. Good predictors have higher logworth values.
  2. Select the variable FREQUENCY_STATUS_97NK in the Split Node window.

    [untitled graphic]

  3. Click OK to define the first split.

    The Tree Diagram shows the first split.

    [untitled graphic]
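The logworth statistic that ranks the candidate rules can be reproduced for the simple case of a binary split on a binary target. This is an illustrative Python sketch, not the application's own code; it uses the Pearson Chi-Square statistic for a 2x2 table and the identity p = erfc(sqrt(chi2/2)) for one degree of freedom. The example tables are made up.

```python
import math

def logworth_2x2(a, b, c, d):
    """Logworth (-log10 of the Chi-Square p-value) for a binary split
    on a binary target, given the 2x2 table [[a, b], [c, d]].
    Higher logworth indicates a more significant candidate split."""
    n = a + b + c + d
    # Pearson Chi-Square statistic for a 2x2 contingency table.
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, the upper tail is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return -math.log10(p_value)

strong = logworth_2x2(400, 100, 100, 400)  # branches separate the classes well
weak = logworth_2x2(260, 240, 240, 260)    # branches barely separate the classes
```

A split that cleanly separates the target classes produces a far larger logworth than one whose branch proportions barely differ, which is what the ranking in the Candidate Splitting Rules window reflects.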


Add Additional Node Statistics

  1. Right-click the background of the Tree Diagram, and then select Node Statistics.

    [untitled graphic]

  2. In the General tab of the Node Statistics window, select the Decision and Predicted Profit boxes and click OK.

    [Node Statistics Window]

    The Decision Tree displays the additional statistics.

    [untitled graphic]

    Note:   By default, the width of a line from a parent node to a child node depends on the ratio of the number of cases in the child node compared to the number of cases in the parent node. This distinction is useful when you are examining a much larger tree and you hide the node statistics.


Shade the Nodes by Profit

You can shade the nodes according to the expected profit value. By default, both the Tree Map and the Tree nodes are shaded according to the proportion of the target event in each node. Lighter-shaded nodes indicate a greater frequency of non-donors.

  1. Right-click the background of the Tree and select Tree View Properties.

    [Tree View Properties Menu Item]

  2. In the Tree View Properties window, select Profit.

    [untitled graphic]

  3. Set Range Maximum to 1.

  4. Set Range Minimum to 0.

  5. Click OK.

    In the Tree window, note that nodes that have higher expected Profit values are shaded darker than nodes that have lower expected Profit values.

    [Tree Window]
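The effect of the Range Minimum and Range Maximum settings amounts to clipping each node's expected profit into the configured range and normalizing it to a shading intensity. A minimal Python sketch, with the function name and clipping behavior assumed for illustration:

```python
def profit_shade(profit, range_min=0.0, range_max=1.0):
    """Map a node's expected profit onto a 0..1 shading intensity,
    clipping values that fall outside the configured range (as the
    Range Minimum/Maximum settings bound the shading scale)."""
    if range_max == range_min:
        return 0.0
    t = (profit - range_min) / (range_max - range_min)
    return min(1.0, max(0.0, t))
```

For example, with the range set to 0 and 1 as in the steps above, a node with expected profit 0.35 shades at 35% intensity, while any profit above 1 or below 0 is clipped to the darkest or lightest shade.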


Define the Second Split

  1. Right-click the node that has 7,630 cases and select Split Node.

    [Split Node Menu Item]

    The Split Node window opens. Candidate rules are always shown for the node that you choose. The Number of Rules property that you set before invoking the application controls how many rules are displayed.

  2. Select PEP_STAR as the splitting variable and then click Apply.

    [untitled graphic]

  3. Click OK. The Tree appears as follows:

    [untitled graphic]


Create a Multi-Way Split

  1. To open the candidate splitting rules for the dark-shaded node (the node that has 3,024 observations), right-click the node and select Split Node.

    [Split Node Menu Item]

  2. In the Split Node window, select the variable Replacement: MONTHS_SINCE_LAST_GIFT and then click Edit Rule.

    [untitled graphic]

  3. In the MONTHS_SINCE_LAST_GIFT Splitting Rule window, enter 8 in the New split point box and then click Add Branch.

    [untitled graphic]

  4. Select Branch 2 (<8.5) and then click Remove Branch.

    [untitled graphic]

  5. In the New split point box, enter 14 as the third split point and then click Add Branch.

    [untitled graphic]

  6. Click OK in order to create the modified three-way split.

    [untitled graphic]

    The node that has 3,024 observations is split three ways.

    [untitled graphic]

  7. Select View [arrow] Tree Map from the main menu. The node that has only 46 observations is shaded red in the Tree Map. Recall that the width of the node is proportional to the number of training cases in the node. Nodes that contain few observations cannot be drawn accurately in the space that is allocated to the view. Depending on how your windows are arranged, you might or might not see a red node. Try reducing the size of the Tree Map window in order to see a node that has fewer observations.

    [Tree Map Window]
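The three-way split above assigns each case to a branch by comparing MONTHS_SINCE_LAST_GIFT against the ordered split points. A short Python sketch of that interval assignment; the function name and the boundary convention (values equal to a split point go to the right-hand branch) are assumptions for illustration:

```python
from bisect import bisect_right

def assign_branch(value, split_points):
    """Return the 0-based branch index for a value under a multi-way
    interval split. For example, split points [8, 14] define three
    branches: value < 8, 8 <= value < 14, and value >= 14."""
    # bisect_right counts how many split points the value has passed.
    return bisect_right(sorted(split_points), value)
```

With split points 8 and 14, a case with 5 months since the last gift falls in branch 0, one with 10 months in branch 1, and one with 20 months in branch 2.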


Prune a Node from the Tree

Use interactive mode to prune a tree.

  1. To define a split for any current terminal node, right-click the node. Select Split Node, then select a splitting rule, and click OK. This example uses the node that has 2,989 observations.

    [untitled graphic]

  2. Right-click the parent node for the split that you just defined and select Prune.

    [untitled graphic]

    Note:   You can also define splits and prune nodes from the Tree Map view.


Train the Tree in Automatic Mode

At this point you can continue to split and prune nodes interactively until you obtain a satisfactory tree, or you can use the algorithm to train the rest of the tree in automatic mode.

  1. To train the tree in automatic mode, right-click a terminal node that has several observations, then select Train. Repeat the process as desired for other nodes.

    [untitled graphic]


Other Tree Control Features

Here are some additional Tree features that you might want to explore:


View the Tree Results

At this point you are still working in interactive mode.

  1. From the main menu, select Train  [arrow]  View Results to change to the results viewer mode. In this task you incorporate validation data into the evaluation of the tree.

    [untitled graphic]

  2. From the main menu, select View  [arrow]  Leaf Statistics Bar Chart and View  [arrow]  Leaf Statistics Table to open the Leaf Statistics Bar Chart and Table.

    [Leaf Statistics Menu Item]

    The Leaf Statistics Bar Chart and Leaf Statistics Table windows open.

    [untitled graphic]

  3. Right-click inside the Leaf Statistics Bar Chart and select Bar Chart Properties.

    [untitled graphic]

  4. Examine the various Bar Chart settings.

    [Leaf Statistics Bar Chart Properties Window]

  5. Select a node, a subtree, or a variable in one window. Note that the other windows are automatically updated to select the corresponding items.

    [untitled graphic]

  6. From the main menu, select View  [arrow]  Assessment Plot to display the Assessment plot.

    [untitled graphic]

    Note:   You can select a smaller tree interactively by selecting a smaller number of leaves in this plot. This feature enables you to choose a smaller tree that performs well on both the training and validation data.
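Selecting a smaller tree from the Assessment plot amounts to choosing the leaf count whose validation statistic is best, preferring fewer leaves on ties. A Python sketch of that selection rule; the function name and the assessment values are hypothetical:

```python
def best_subtree(assessment):
    """Pick the number of leaves whose validation assessment is best;
    ties go to the smaller tree. `assessment` maps leaf count to a
    validation statistic where higher is better (e.g. average profit)."""
    return min(assessment, key=lambda leaves: (-assessment[leaves], leaves))

# Hypothetical validation assessment by subtree size.
sizes = {2: 0.61, 5: 0.74, 9: 0.74, 14: 0.70}
```

Here the 5-leaf and 9-leaf subtrees assess equally well on validation data, so the rule prefers the 5-leaf tree, mirroring the idea of choosing the smallest tree that performs well.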


View the Tree in the Java Tree Results Viewer of Enterprise Miner

  1. Close the application and save the changes to the model. The following message is displayed when you close the model in the Tree Desktop Application.

    [Update Enterprise Miner Window]

  2. Run the Decision Tree node path from the Diagram Workspace and follow the messages that direct you to view the results.

    [untitled graphic]

  3. Close the Tree Results window.

  4. Rename the second Decision Tree node in your diagram to indicate that it is the interactive training tree. Right-click the second Decision Tree node, select Rename, and rename the node to Interactive Decision Tree.

    [untitled graphic]

    [Rename Window]

  5. Click OK.

    [untitled graphic]
