Neural networks are a class of parametric models that can accommodate
a wider variety of nonlinear relationships between a set of predictors
and a target variable than logistic regression can. Building a neural
network model involves two main phases. First, you define the network
configuration, that is, the structure of the model that you want to
use. Then, you iteratively train the model.
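The two phases can be illustrated outside SAS Enterprise Miner with a small NumPy sketch. Everything here is an assumption made for illustration: the names `NetworkConfig`, `train`, and `forward`, the tanh hidden layer, the plain gradient descent update, and the synthetic data are not taken from the product, whose training algorithm and defaults may differ.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class NetworkConfig:
    """Phase 1: the structure of the model (hypothetical names)."""
    n_inputs: int
    n_hidden: int


def forward(p, X):
    """Return hidden-layer activations and predicted probabilities."""
    H = np.tanh(X @ p["W1"] + p["b1"])
    prob = 1.0 / (1.0 + np.exp(-(H @ p["w2"] + p["b2"])))
    return H, prob


def log_loss(prob, y):
    """Mean binary cross-entropy, clipped to avoid log(0)."""
    eps = 1e-12
    return -np.mean(y * np.log(prob + eps) + (1 - y) * np.log(1 - prob + eps))


def train(cfg, X, y, n_iter=300, lr=0.5, seed=0):
    """Phase 2: iteratively adjust the weights by gradient descent."""
    rng = np.random.default_rng(seed)
    p = {
        "W1": rng.normal(0.0, 0.5, (cfg.n_inputs, cfg.n_hidden)),
        "b1": np.zeros(cfg.n_hidden),
        "w2": rng.normal(0.0, 0.5, cfg.n_hidden),
        "b2": 0.0,
    }
    history = []
    n = len(y)
    for _ in range(n_iter):
        H, prob = forward(p, X)
        history.append(log_loss(prob, y))
        # Gradient of the mean log loss with respect to the output logit.
        d_out = (prob - y) / n
        # Back-propagate through the tanh hidden layer.
        d_hid = np.outer(d_out, p["w2"]) * (1.0 - H ** 2)
        p["W1"] -= lr * (X.T @ d_hid)
        p["b1"] -= lr * d_hid.sum(axis=0)
        p["w2"] -= lr * (H.T @ d_out)
        p["b2"] -= lr * d_out.sum()
    return p, history


# Synthetic, nonlinear target: 1 when both inputs share a sign.
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=(200, 2))
y = (X[:, 0] * X[:, 1] > 0).astype(float)

cfg = NetworkConfig(n_inputs=2, n_hidden=4)
params, history = train(cfg, X, y)
print(f"log loss: {history[0]:.3f} -> {history[-1]:.3f}")
```

The configuration object is fixed before training starts, and the loop only changes the weights. This mirrors the split between choosing a structure and fitting it.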
A neural network model is harder to explain to the management of your
organization than a regression or a decision tree. However, you know
that management would prefer a stronger predictive model, even if it
is more complicated. So, you decide to run a neural network model,
which you will compare to the other models later in the example.
Because neural networks
are so flexible, SAS Enterprise Miner has two nodes that fit neural
network models: the Neural Network node and the AutoNeural node. The
Neural Network node trains a specific neural network configuration;
this node is best used when you know a lot about the structure of
the model that you want to define. The AutoNeural node searches over
several network configurations to find one that best describes the
relationship in a data set and then trains that network.
This example does not
use the AutoNeural node. However, you are encouraged to explore the
features of this node on your own.
Before creating a neural network, you will use the Variable Selection
node to reduce the number of input variables, which saves computer
resources. To use the Variable Selection node to reduce the number of
input variables that are used in a neural network:
- Select the Explore tab on the Toolbar.
- Select the Variable Selection node icon. Drag the node into the
  Diagram Workspace.
- Connect the Transform Variables node to the Variable Selection node.
- In the Diagram Workspace, right-click the Variable Selection node,
  and select Run from the resulting menu. Click Yes in the
  confirmation window that opens.
- In the window that appears when processing completes, click Results.
  The Results window appears.
- Expand the Variable Selection window. Examine the table to see which
  variables were selected. The role for variables that were not
  selected has been changed to Rejected.
  Note: In this example, variable selection used a forward stepwise
  least squares regression method that maximizes the model R-square
  value. For more information about this method, see the SAS
  Enterprise Miner Help.
- Close the Results window.
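To give a rough idea of what an R-square-driven forward selection does, here is a minimal NumPy sketch. It is not the Variable Selection node's implementation: it performs greedy forward selection only (no removal step), and the function names, the `min_gain` stopping threshold, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np


def r_squared(X, y):
    """R-square of a least squares fit of y on the columns of X plus an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    ss_res = resid @ resid
    ss_tot = ((y - y.mean()) ** 2).sum()
    return 1.0 - ss_res / ss_tot


def forward_select(X, y, min_gain=0.005):
    """Add, one at a time, the column that raises R-square the most.

    Stops when the best remaining column improves R-square by less than
    min_gain (a hypothetical threshold, not a product default).
    """
    selected, remaining = [], list(range(X.shape[1]))
    best_r2 = 0.0
    while remaining:
        scores = {j: r_squared(X[:, selected + [j]], y) for j in remaining}
        j_best = max(scores, key=scores.get)
        if scores[j_best] - best_r2 < min_gain:
            break
        selected.append(j_best)
        remaining.remove(j_best)
        best_r2 = scores[j_best]
    return selected, best_r2


# Synthetic data: only columns 1 and 4 actually drive the target.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 6))
y = 3.0 * X[:, 1] - 2.0 * X[:, 4] + 0.1 * rng.normal(size=300)

selected, r2 = forward_select(X, y)
rejected = [j for j in range(X.shape[1]) if j not in selected]
print("selected:", selected, "rejected:", rejected, "R-square:", round(r2, 3))
```

Columns that never enter the model play the part of the variables whose role is changed to Rejected in the steps above.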
The input data is now
ready to be modeled with a neural network. To use the Neural Network
node to train a specific neural network configuration:
- From the Model tab on the Toolbar, select the Neural Network node
  icon. Drag the node into the Diagram Workspace.
- Connect the Variable Selection node to the Neural Network node.
- Select the Neural Network node. In the Properties Panel, scroll down
  to view the Train properties, and click the ellipsis button that
  represents the value of Network. The Network window appears. For
  more information about neural networks, connections, and hidden
  units, see the Neural Network Node: Reference documentation in SAS
  Enterprise Miner Help.
- Click the value of Direct Connection and select Yes from the
  drop-down menu that appears. This selection enables the network to
  have connections directly between the inputs and the outputs in
  addition to connections via the hidden units.
- Click the value of Number of Hidden Units and enter 5. This example
  trains a multilayer perceptron neural network with five units on the
  hidden layer.
- In the Diagram Workspace, right-click the Neural Network node, and
  select Run from the resulting menu. Click Yes in the confirmation
  window that opens.
- In the window that appears when processing completes, click Results.
  The Results window appears. Maximize the Score Rankings Overlay
  window. From the drop-down menu, select Cumulative Total Expected
  Profit.
  Notice that the plot from this model has a different shape than the
  plot from the logistic regression model. This plot seems to suggest
  that you would get better results by mailing just the top 40% of
  your candidates.
- Close the Results window.
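The effect of the Direct Connection and Number of Hidden Units settings can be pictured with a short NumPy sketch of a forward pass. This is only an illustration under stated assumptions: the function and parameter names are hypothetical, the tanh hidden activation and logistic output are assumed rather than taken from the node's documentation, the eight inputs are an arbitrary count, and the weights are random rather than trained.

```python
import numpy as np


def init_network(n_inputs, n_hidden=5, direct_connection=True, seed=0):
    """Create random weights for a one-hidden-layer network (hypothetical layout)."""
    rng = np.random.default_rng(seed)
    params = {
        "W_in_hid": rng.normal(0.0, 0.1, (n_inputs, n_hidden)),
        "b_hid": np.zeros(n_hidden),
        "w_hid_out": rng.normal(0.0, 0.1, n_hidden),
        "b_out": 0.0,
    }
    if direct_connection:
        # Extra weights that link each input straight to the output.
        params["w_in_out"] = rng.normal(0.0, 0.1, n_inputs)
    return params


def predict_proba(params, X):
    """Forward pass: hidden-unit path plus, if present, the direct path."""
    hidden = np.tanh(X @ params["W_in_hid"] + params["b_hid"])
    logit = hidden @ params["w_hid_out"] + params["b_out"]
    if "w_in_out" in params:
        logit = logit + X @ params["w_in_out"]
    return 1.0 / (1.0 + np.exp(-logit))


def count_weights(params):
    """Total number of trainable values in the network."""
    return sum(np.size(v) for v in params.values())


n_inputs = 8  # arbitrary number of inputs for illustration
with_direct = init_network(n_inputs, n_hidden=5, direct_connection=True)
without_direct = init_network(n_inputs, n_hidden=5, direct_connection=False)

X_demo = np.random.default_rng(1).normal(size=(4, n_inputs))
probs = predict_proba(with_direct, X_demo)
print(count_weights(with_direct), count_weights(without_direct))
```

With direct connections, the output combines a linear term in the inputs, much like a logistic regression, with the nonlinear term supplied by the five hidden units. The cost is one additional weight per input.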