The Data Partition node enables you to partition your input data into the following data sets:
- Train: used for preliminary model fitting. The analyst attempts to find the best model weights by using this data set.
- Validation: used to assess the adequacy of the model in the Model Comparison node. The validation data set is also used for model fine-tuning in the Decision Tree model node to create the best subtree.
- Test: used to obtain a final, unbiased estimate of the generalization error of the model.
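The division of labor among the three data sets can be illustrated outside SAS Enterprise Miner. The following is a minimal Python sketch, not the product's implementation: the synthetic data, the nearest-neighbor classifier, and the names `make_data`, `knn_predict`, and `error_rate` are all assumptions made for illustration. The training rows fit the model, the validation rows choose among candidate settings, and the test rows are touched only once to estimate the generalization error.

```python
import random

def make_data(n, rng):
    """Synthetic rows (x, label): label is 1 when x > 0.5, flipped 10% of the time."""
    rows = []
    for _ in range(n):
        x = rng.random()
        label = int(x > 0.5)
        if rng.random() < 0.1:
            label = 1 - label  # inject label noise
        rows.append((x, label))
    return rows

def knn_predict(train, x, k):
    """Majority vote among the k training rows closest to x (k should be odd)."""
    nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return int(votes * 2 > k)

def error_rate(train, rows, k):
    """Fraction of rows misclassified by a k-nearest-neighbor model built on train."""
    wrong = sum(knn_predict(train, x, k) != label for x, label in rows)
    return wrong / len(rows)

rng = random.Random(0)  # arbitrary seed, for repeatability only
train = make_data(300, rng)
validation = make_data(100, rng)
test = make_data(100, rng)

# Validation data: pick the candidate setting with the lowest validation error.
candidates = [1, 5, 15]
best_k = min(candidates, key=lambda k: error_rate(train, validation, k))

# Test data: used once, after selection, to estimate the generalization error.
test_error = error_rate(train, test, best_k)
print(best_k, test_error)
```

Because the test rows play no part in fitting or selection, the final error estimate is not biased by the tuning step.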
For more information about the Data Partition node, see the SAS Enterprise Miner Help.
Perform the following steps to add a Data Partition node to the analysis:
- Select the Sample tab on the node toolbar and drag a Data Partition node into the diagram workspace.
- Connect the VAEREXT_SERIOUS input data node to the Data Partition node.

  Note: To connect one node to another node in the default horizontal view, position the mouse pointer at the right edge of a node. A pencil icon appears. Hold down the left mouse button, drag the line to the left edge of the node that you want to connect to, and then release the left mouse button. To change your view of connected nodes to a vertical view, right-click in the diagram workspace, and select Layout Vertically in the menu that appears.
- Select the Data Partition node to view its properties. Details about the node appear in the Properties Panel.
- Set the Data Set Allocations properties as follows:
  - Set the Training property to 60.0.
  - Set the Validation property to 20.0.
  - Set the Test property to 20.0.
These settings allocate 60% of the observations to training, 20% to validation, and 20% to testing, which ensures adequate data in each partition when you build prediction models with the VAEREXT_SERIOUS data.
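To make the allocation concrete, the following minimal Python sketch shows what a 60.0/20.0/20.0 split does to a set of rows. It is an illustration only: the function `partition`, its parameter names, the seed value, and the simple random shuffle are assumptions, and the sketch does not reproduce the partitioning method that the Data Partition node itself applies.

```python
import random

def partition(rows, train=60.0, validation=20.0, test=20.0, seed=12345):
    """Randomly split rows into train/validation/test by percentage.

    The percentages mirror the Data Set Allocations values; the seed is
    an arbitrary choice that only makes the split repeatable.
    """
    if abs(train + validation + test - 100.0) > 1e-9:
        raise ValueError("allocations must sum to 100")
    shuffled = list(rows)  # copy so the caller's data is not reordered
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    n_train = round(n * train / 100.0)
    n_valid = round(n * validation / 100.0)
    # Whatever remains after train and validation goes to test.
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_valid],
        shuffled[n_train + n_valid:],
    )

rows = list(range(1000))  # stand-in for 1,000 observations
train_set, valid_set, test_set = partition(rows)
print(len(train_set), len(valid_set), len(test_set))  # 600 200 200
```

Every row lands in exactly one partition, so the three data sets never overlap.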