Working with Nodes That Sample, Explore, and Modify
You use the Replacement node to generate score code to process unknown variable levels when you are scoring data, and to interactively specify replacement values for class levels.
In this task, you add and configure a Replacement node in your process flow diagram.
From the Modify tab of the node toolbar, drag a Replacement node into the Diagram Workspace and connect it to the Data Partition node.
Select the Data Partition node. On the Properties panel, select the ellipsis button to the right of the Variables property to explore any of the variables in the input data set. The Variables window opens.
In the Variables window, sort by level, select the variables SES and URBANICITY, and then click Explore. The Explore window opens.
Note: If Explore is dimmed and unavailable, right-click the Data Partition node and select Run.
In the Explore window, notice that both the SES and URBANICITY variables contain observations that have missing values. The missing values are represented by question marks (?). Later, you will use the Impute node to replace the missing values with imputed values that have more predictive power.
Double-click the bar that corresponds to missing values (SES = "?") in the SES histogram. Notice that observations with a missing value for SES also have a missing value for URBANICITY. The graphs are linked, so a selection in one graph is highlighted in the other.
Close the Explore window.
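The pattern seen in the Explore window, where SES and URBANICITY are missing together, can also be checked outside the interface. The following is a minimal Python sketch, not part of Enterprise Miner; the rows are hypothetical toy data, and only the variable names SES and URBANICITY and the question-mark coding come from this task.

```python
from collections import Counter

# Hypothetical toy rows; the real input data set has many more
# observations and variables.
rows = [
    {"SES": "1", "URBANICITY": "C"},
    {"SES": "?", "URBANICITY": "?"},
    {"SES": "2", "URBANICITY": "S"},
    {"SES": "?", "URBANICITY": "?"},
    {"SES": "3", "URBANICITY": "R"},
]

MISSING_MARK = "?"  # how missing values are coded in the raw data

# Cross-tabulate, per row, whether each variable is coded as "?".
crosstab = Counter(
    (row["SES"] == MISSING_MARK, row["URBANICITY"] == MISSING_MARK)
    for row in rows
)

# Rows where both variables are "?" versus rows where only one is.
both_missing = crosstab[(True, True)]
mismatched = crosstab[(True, False)] + crosstab[(False, True)]

print("both missing:", both_missing)
print("only one missing:", mismatched)
```

A count of zero for rows where only one variable is missing matches what the linked histograms show.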
Close the Variables window.
In the Replacement node Properties panel, select the ellipsis button to the right of the Class Variables Replacement Editor property.
The Replacement Editor window opens.
Note: By default, Enterprise Miner handles unknown levels according to the Unknown Levels property in the Properties panel. The choices are Ignore, Missing, and Mode (the most frequent value). Ensure that the Unknown Levels property is set to Ignore.
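To make the three choices concrete, here is a minimal Python sketch of how an unknown class level might be handled at scoring time. It is an illustration, not the score code that Enterprise Miner generates. The function name `handle_unknown_level`, the use of `None` for a missing value, and the reading of Ignore as "leave the value unchanged" are assumptions made for this example.

```python
from collections import Counter

def handle_unknown_level(value, training_values, policy="ignore"):
    """Return the value to score with, given levels seen in training.

    policy is one of "ignore", "missing", or "mode" (hypothetical names
    mirroring the Ignore, Missing, and Mode choices).
    """
    if value in set(training_values):
        return value  # known level: nothing to do
    if policy == "ignore":
        return value  # assumed meaning: pass the unknown level through
    if policy == "missing":
        return None   # None stands in for a missing value
    if policy == "mode":
        # Most frequent level observed in the training data.
        return Counter(training_values).most_common(1)[0][0]
    raise ValueError(f"unknown policy: {policy!r}")

# Hypothetical training levels for a class variable.
training = ["C", "C", "S", "R", "C"]

print(handle_unknown_level("X", training, "ignore"))
print(handle_unknown_level("X", training, "missing"))
print(handle_unknown_level("X", training, "mode"))
```

With Ignore selected, as in this task, only the replacements that you enter explicitly in the Replacement Editor window take effect.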
Scroll through the data table in the Replacement Editor window. Observe the values for the variable levels of SES and URBANICITY. When one of these levels displays a question mark (?) in the Char Raw value column, enter _MISSING_ in the Replacement Value column for that row. This causes the Replacement node to replace that value with a SAS missing value.
Click OK.
Right-click the Replacement node and select Run.
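The effect of the replacement values entered in the steps above can be sketched as a lookup table applied to each observation. This Python example is only an illustration of the idea, not the code that the Replacement node produces. The names `replacement_table` and `apply_replacements` are hypothetical, `None` stands in for a SAS missing value, and the sample row (including the AGE variable) is invented.

```python
MISSING = None  # stands in for a SAS missing value in this sketch

# One mapping per class variable: raw level -> replacement value.
# Mirrors entering _MISSING_ for the "?" level of SES and URBANICITY.
replacement_table = {
    "SES": {"?": MISSING},
    "URBANICITY": {"?": MISSING},
}

def apply_replacements(row, table):
    """Return a copy of row with the listed levels replaced."""
    cleaned = dict(row)  # leave the input row untouched
    for variable, mapping in table.items():
        if variable in cleaned and cleaned[variable] in mapping:
            cleaned[variable] = mapping[cleaned[variable]]
    return cleaned

# Hypothetical observation with one "?" level.
raw = {"SES": "?", "URBANICITY": "C", "AGE": 40}
print(apply_replacements(raw, replacement_table))
```

Levels that are not listed in the table pass through unchanged, which corresponds to leaving the Replacement Value column empty for those rows.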
Copyright © 2008 by SAS Institute Inc., Cary, NC, USA. All rights reserved.