Example Process Flow Diagram

Task 9. Creating a Multilayer Perceptron Neural Network Model

The attraction of a standard regression model is its simplicity. Unfortunately, the structure of the standard model does not allow for nonlinear associations between the inputs and the target. If such associations exist, both the predicted response probabilities and any interpretations of the modeling results will be inaccurate.

By adding polynomial and interaction terms to the standard model, nonlinear associations can be incorporated into the logistic regression. For more information about this task, see the Regression node section in the Enterprise Miner online reference documentation (Help > EM Reference > Regression Node).
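
As a rough illustration of what polynomial and interaction terms do, the following Python sketch expands two inputs into squared and cross-product terms and passes them through a logistic function. It is not Enterprise Miner code: the function names, the two-input setup, and the coefficient values are all hypothetical and were chosen only to make the example runnable.

```python
import math

def expand_inputs(x1, x2):
    """Return the original inputs plus squared and interaction terms."""
    return [x1, x2, x1 * x1, x2 * x2, x1 * x2]

def predict_probability(features, weights, intercept):
    """Logistic regression: logistic function of a linear combination."""
    z = intercept + sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients, one per expanded feature (illustration only).
weights = [0.5, -0.25, 0.1, 0.0, -0.3]
intercept = -1.0

features = expand_inputs(2.0, 1.0)
probability = predict_probability(features, weights, intercept)
print(features)
print(round(probability, 4))
```

The model stays linear in its coefficients, so it is still fit as an ordinary logistic regression. The nonlinearity comes entirely from the expanded inputs.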

You can also use a different modeling tool, such as the Neural Network node. You can use the Neural Network node to fit nonlinear models like a multilayer perceptron (MLP). Neural networks are flexible classification methods that, when carefully tuned, often provide optimal performance in classification problems such as this one. Unfortunately, it is difficult to assess the importance of individual inputs on the classification. For this reason, MLPs have come to be known as "black box" predictive modeling tools. To create an MLP model:

  1. Add a Neural Network node to the Diagram Workspace.

  2. Connect the Transform Variables node to the Neural Network node. (Now, both the Neural Network node and the Regression node should be connected to the Transform Variables node.)

  3. Open the configuration interface to the Neural Network node. The Variables tab lists the input variables and the target. All of the input variables have a status of use, indicating that they will be used to train the network. If you know that an input is not important in predicting the target, you might want to set the status of that variable to don't use (right-click in the Status cell for that variable input, select Set Status, and then select don't use). For this example, all variable inputs will be used to train the network.

  4. The Neural Network node provides a basic (default) and an advanced user interface for configuring the network. The basic interface contains a set of easy-to-use templates for configuring the network. The advanced user interface gives you more control over the network configuration. For example, you can change the objective function in the advanced interface, but not in the basic interface. If you are not familiar with neural networks, you might want to first experiment with different basic network configurations. To configure the network in the advanced interface, you must first select the Advanced user interface check box in the General tab. For this example, you will use the advanced interface to see a schematic representation of the network.

  5. Because you defined a loss matrix for the GOOD_BAD target, the node automatically sets the model selection criteria to Profit/Loss. The node will select the model that minimizes the expected loss for the cases in the validation data set.

  6. Select the Advanced user interface check box. The Advanced tab becomes active and the Basic tab becomes dimmed and unavailable when you select this check box.

    [General tab of the Neural Network: Model Untitled window showing Profit/Loss model selection criteria, Advanced user interface selected, and Training process monitor selected.]

  7. To display a schematic representation of the network, select the Advanced tab.

    [Advanced tab of the Neural Network: Model Untitled window showing schematic of the MLP network in the Network Subtab.]

    The layer on the left represents the input layer that consists of all the interval, nominal, and ordinal inputs. The middle layer is the hidden layer, which has three hidden units (neurons). The layer on the right is the output layer, which corresponds to the target (GOOD_BAD). When you train the Neural Network node, linear combinations of the inputs are transformed by the hidden units and are recombined to form an estimate of the predicted probability of having bad credit.

  8. Save the model by using the File menu to select Save New Model. Type a model name and description and then click OK. The model is added as an entry in the Model Manager. By default, the model is saved as "Untitled."

    [Save Model As window with Model Name Neural and Model Description as Neural Network Model]

  9. Train the model by clicking the Run tool icon at the top of the application.

    Note:   Because you have already run the predecessor nodes in the process flow, you can run the Neural Network node while it is open.

  10. When the node is running, the Neural Network Monitor window displays the error evolution for the training and validation data sets during optimization.

    [Neural Network Monitor window showing stepwise iterations of training and validation data in the process monitor line graph.]

  11. By default, the node completes 100 iterations. To stop training at any time, click Stop. To resume training, click Continue. To stop training altogether and close the monitor, click Close.

  12. When the node has completed training, click Yes in the message window to open the Results Browser.

  13. Select the Plot tab of the Neural Network Results Browser. By default, the plot shows the average squared error for each iteration of the training and validation data sets.

    [Plot tab of the Neural Network Results window showing Average Error line graph for Train and Validation data.]

    For this example, the lowest average error for the validation data set was achieved at the 35th iteration. Beyond the 35th iteration, overtraining occurs with respect to the validation data set: the network is being trained to the noise in the training data set instead of the underlying patterns in the data. Note how the training and validation lines diverge beyond the 35th iteration. The message indicator panel at the bottom of the window lists the optimal run, step, iteration, and the average squared error for the training and validation data sets.

    Note:   Each time that you open the Neural Network node, a new random seed is created and used to generate starting values for training the network. Therefore, your results might differ slightly from the results that are displayed in this document.

  14. To view the average loss for each iteration, right-click on the plot and select Loss.

    [Plot tab of the Neural Network Results window showing line graph for Average Loss in each iteration.]

    The average expected loss is minimized at the 35th iteration. To access a dialog box that displays the average loss for this iteration, right-click on the plot, select Enable pop-up info, and then select the vertical, white reference line.

    [Values window showing the value of the selected variables in the plot for a given iteration.]

    The average profit for the cases in the validation data set is 58 cents; a negative expected loss represents a profit. Note that the expected loss of -58 cents is adjusted for the prior probabilities that you specified in the prior vector of the target profile for GOOD_BAD.

  15. Close the Neural Network Results Browser and the Neural Network node.
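
To make the description of the network in step 7 more concrete, the following Python sketch computes the forward pass of an MLP with one hidden layer of three units. It is only an illustration, not the computation that the Neural Network node performs internally. The tanh hidden activation, the logistic output, the two-input setup, and every weight value are assumptions made for this example.

```python
import math

def mlp_forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """Forward pass of a one-hidden-layer perceptron for a binary target."""
    # Each hidden unit applies tanh to its own linear combination of the inputs.
    hidden = [
        math.tanh(bias + sum(w * x for w, x in zip(weights, inputs)))
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    # The output unit recombines the hidden activations; the logistic
    # function maps the result to a value between 0 and 1.
    z = output_bias + sum(w * h for w, h in zip(output_weights, hidden))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical weights for two inputs and three hidden units (assumed values).
hidden_weights = [[0.4, -0.2], [0.1, 0.3], [-0.5, 0.25]]
hidden_biases = [0.0, -0.1, 0.2]
output_weights = [1.2, -0.7, 0.5]
output_bias = -0.3

p_bad = mlp_forward([1.0, 2.0], hidden_weights, hidden_biases,
                    output_weights, output_bias)
print(round(p_bad, 4))
```

Training adjusts the hidden and output weights; the structure of the calculation stays the same. Because the estimate depends on the inputs only through the hidden units, the contribution of any single input is hard to read off the weights, which is the "black box" property mentioned earlier.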
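
The overtraining pattern described in step 13 can also be sketched in a few lines of Python. The example below picks the iteration with the lowest validation error from a recorded error history. The error values are made up for illustration and are not taken from this example's results, and the function name is hypothetical.

```python
def best_iteration(validation_errors):
    """Return the 1-based iteration with the smallest validation error."""
    best_index = min(range(len(validation_errors)),
                     key=validation_errors.__getitem__)
    return best_index + 1

# Hypothetical error histories: training error keeps falling, while
# validation error bottoms out and then rises as overtraining sets in.
training_errors = [0.30, 0.24, 0.20, 0.17, 0.15, 0.14]
validation_errors = [0.31, 0.26, 0.23, 0.22, 0.24, 0.27]

chosen = best_iteration(validation_errors)
for i, (train, valid) in enumerate(zip(training_errors, validation_errors), start=1):
    marker = " <- selected" if i == chosen else ""
    print(f"iteration {i}: train={train:.2f} validation={valid:.2f}{marker}")
```

Selecting on the validation data rather than the training data is what guards against fitting noise: past the selected iteration the training error still improves, but the validation error does not.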
