Working with Nodes That Assess
In this task, you use the Model Comparison node to benchmark model performance and find a champion model among the Regression, Neural Network, AutoNeural, and Decision Tree nodes in your process flow diagram. The Model Comparison node enables you to judge the generalization properties of each predictive model based on its predictive power, lift, sensitivity, profit or loss, and so on.
Drag a Model Comparison node from the Assess tab of the node toolbar into the Diagram Workspace. Connect the Model Comparison node to the Regression, Decision Tree, AutoNeural, and Neural Network nodes as shown below.
Right-click the Model Comparison node and select Run. A Confirmation window appears. Click the button that confirms the run.
Note: Running the process flow diagram might take several minutes.
When the process flow diagram run is complete, click the button to view the results. The Results window opens.
The Results window displays the following information for a binary target:
Receiver Operating Characteristics (ROC) charts. The charts overlay the competing models for both the training and validation data (this example does not create a test data set). Each point on the ROC curve represents a cutoff probability. Points closer to the upper-right corner correspond to low cutoff probabilities. Points closer to the lower-left corner correspond to higher cutoff probabilities. The performance quality of a model is indicated by the degree to which the ROC curve pushes upward and to the left. This degree can be quantified as the area under the ROC curve. The area under the ROC curve, or ROC Index, is summarized in the Output window of the Model Comparison node.
A Score Rankings chart. For a binary target, all observations in the scored data set are sorted by the posterior probabilities of the event level in descending order for each model.
A detailed listing of model diagnostics. The list is provided in the Output window. In this example, the Neural model is marked Y as the selected model, because the Neural model maximizes the average profit when applied to the validation data. Maximizing average profit is the default criterion for choosing the best model when a profit matrix is defined and when validation data is available. The scoring formula for this model will automatically be passed to the successor Score node for scoring new data.
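The average-profit criterion described above can be illustrated outside Enterprise Miner with a small Python sketch. Everything in it is an assumption made for illustration: the profit matrix values, the decision names, the posterior probabilities, and the helper `average_profit` are hypothetical and do not come from the example data or from the node's internal code. For each validation case, the sketch picks the decision with the highest expected profit under the model's posterior probability, then credits the profit actually realized for that decision.

```python
# Hypothetical profit matrix: PROFIT[(actual_target, decision)].
# The values are illustrative only.
PROFIT = {
    (1, "act"): 10.0, (0, "act"): -2.0,
    (1, "ignore"): 0.0, (0, "ignore"): 0.0,
}
DECISIONS = ("act", "ignore")


def average_profit(posteriors, actuals, profit=PROFIT):
    """Mean realized profit when each case gets its best expected decision."""
    total = 0.0
    for p, y in zip(posteriors, actuals):
        # Expected profit of a decision given posterior event probability p.
        best = max(
            DECISIONS,
            key=lambda d: p * profit[(1, d)] + (1 - p) * profit[(0, d)],
        )
        total += profit[(y, best)]
    return total / len(actuals)


# Made-up validation targets and posterior probabilities per model.
actuals = [1, 0, 1, 0, 0, 1, 0, 0]
models = {
    "Neural": [0.9, 0.1, 0.8, 0.05, 0.3, 0.7, 0.1, 0.15],
    "Tree": [0.6, 0.4, 0.6, 0.4, 0.4, 0.1, 0.1, 0.1],
    "Reg": [0.8, 0.3, 0.7, 0.1, 0.3, 0.5, 0.2, 0.1],
}

profits = {name: average_profit(p, actuals) for name, p in models.items()}
champion = max(profits, key=profits.get)  # model with the highest average profit
print(profits, champion)
```

With these made-up numbers the model labeled Neural has the highest average profit, which mirrors how a Y would mark the selected model in the diagnostics listing.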
Note: You can also use the Fit Statistics window to determine the champion model. The champion model displays a Y in the Selected Model column of the Fit Statistics window.
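To make the ROC description above concrete, the following Python sketch builds one ROC point per cutoff probability and computes the area under the curve with the trapezoidal rule. It is a generic illustration, not the Model Comparison node's implementation; the function names `roc_points` and `roc_index` and the sample data are hypothetical.

```python
def roc_points(posteriors, actuals):
    """Return (false positive rate, true positive rate) pairs.

    Cutoffs run from high to low, so the points move from the
    lower-left corner toward the upper-right corner.
    """
    n_pos = sum(actuals)
    n_neg = len(actuals) - n_pos
    points = [(0.0, 0.0)]  # cutoff above every posterior: nothing flagged
    for cutoff in sorted(set(posteriors), reverse=True):
        tp = sum(1 for p, y in zip(posteriors, actuals) if p >= cutoff and y == 1)
        fp = sum(1 for p, y in zip(posteriors, actuals) if p >= cutoff and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def roc_index(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Made-up posterior probabilities and binary targets (1 = event).
posteriors = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
actuals = [1, 1, 0, 1, 0, 1, 0, 0]

points = roc_points(posteriors, actuals)
auc = roc_index(points)
print(points)
print(auc)
```

A curve that hugs the upper-left corner yields an area near 1, while a model no better than chance yields an area near 0.5.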
Change the vertical axis statistic on the Score Rankings plot to display the profit. Right-click the background of the Score Rankings plot and select Data Options.
In the Data Options Dialog window, scroll down the list of variables until you see the variable Profit. Change the Role of Profit to Y.
Click the button that confirms the change.
Note: The drop-down box still reads Cumulative Lift even though the graph now displays Profit. This is because the list of variables that populate the drop-down list does not include Profit. When you use the Data Options dialog to change the vertical axis of the plot to a variable that is not in the drop-down list, the drop-down box displays the default value or the last value that was selected.
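For readers who want to see how statistics like those on the Score Rankings plot can be derived, here is a minimal Python sketch. It sorts scored cases by posterior probability in descending order, splits them into equal-sized bins, and reports cumulative lift and a mean profit per bin. The function `score_rankings`, the bin count, the sample data, and the two profit values are hypothetical, and the profit calculation simply assumes that every case in a bin is acted on; Enterprise Miner's own Profit variable may be defined differently.

```python
def score_rankings(posteriors, actuals, n_bins=4,
                   profit_event=10.0, profit_nonevent=-2.0):
    """Cumulative lift and mean profit per bin of ranked cases.

    Assumes at least n_bins cases so that no bin is empty.
    """
    ranked = sorted(zip(posteriors, actuals), key=lambda r: r[0], reverse=True)
    n = len(ranked)
    base_rate = sum(actuals) / n  # overall event rate
    rows = []
    cum_events = 0
    for b in range(n_bins):
        start = b * n // n_bins
        end = (b + 1) * n // n_bins
        chunk = ranked[start:end]
        events = sum(y for _, y in chunk)
        cum_events += events
        # Event rate among the top `end` cases relative to the overall rate.
        cum_lift = (cum_events / end) / base_rate
        # Illustrative profit if every case in this bin were acted on.
        mean_profit = (
            events * profit_event + (len(chunk) - events) * profit_nonevent
        ) / len(chunk)
        rows.append(
            {"depth": end / n, "cum_lift": cum_lift, "mean_profit": mean_profit}
        )
    return rows


# Made-up posterior probabilities and binary targets (1 = event).
posteriors = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
actuals = [1, 1, 0, 1, 0, 1, 0, 0]

rows = score_rankings(posteriors, actuals)
for row in rows:
    print(row)
```

Cumulative lift always falls to 1 at full depth, because the top 100 percent of cases has the overall event rate by definition.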
Close the Results window.
Copyright © 2008 by SAS Institute Inc., Cary, NC, USA. All rights reserved.