In this example, the variables SES and URBANICITY are class variables for which the value ? denotes a missing value. Because a question mark is not a missing value as SAS defines one (that is, a blank or a period), SAS Enterprise Miner treats it as an additional level of the class variable. However, the knowledge that these values are missing will be useful later in the model-building process.
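As a rough illustration of why the question mark ends up as its own level, the following Python sketch counts the levels of a class variable while skipping only blanks and periods. The sample values and the `class_levels` helper are hypothetical and are not part of SAS Enterprise Miner.

```python
from collections import Counter

# Hypothetical SES values; "" and "." stand in for SAS missing values.
ses_values = ["1", "2", "?", "3", "", "2", "?", "."]

# Assumption: only a blank or a period counts as missing, as the text describes.
SAS_MISSING = {"", "."}


def class_levels(values):
    """Count the levels of a class variable, skipping blank/period values."""
    return Counter(v for v in values if v.strip() not in SAS_MISSING)


levels = class_levels(ses_values)
# "?" is not recognized as missing, so it is counted as one more level.
print(sorted(levels))
```

Because "?" is neither a blank nor a period, it survives the filter and is counted like any other level, which is the behavior the Replacement node is used to correct.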
To use the Replacement node to interactively specify that such observations of these variables are missing, complete the following steps:
- Select the Modify tab on the Toolbar.
- Select the Replacement node icon. Drag the node into the Diagram Workspace.
- Connect the Data Partition node to the Replacement node.
- Select the Replacement node. In the Properties Panel, scroll down to view the Train properties.
- For interval variables, click the value of Default Limits Method, and select None from the drop-down menu that appears. This selection indicates that no values of interval variables should be replaced. With the default selection, a particular range would have been enforced for the values of each interval variable. In this example, you do not want to enforce such a range.
  Note: In this data set, all missing interval variable values are correctly coded as SAS missing values (a blank or a period).
- For class variables, click the ellipsis button that represents the value of Replacement Editor. The Replacement Editor opens.
- Notice that SES and URBANICITY both have a level that contains observations with the value ?. In the case of these two variables, this level represents observations with missing values. Enter _MISSING_ as the Replacement Value for the two rows. This action enables SAS Enterprise Miner to recognize that the question marks indicate missing values for these two variables. Later, you will impute values for observations with missing values.
- Enter _UNKNOWN_ as the Replacement Value for the level of DONOR_GENDER that has the value A. This value is the result of a data entry error, and you do not know whether the intention was to code it as an F or an M.
- In the Diagram Workspace, right-click the Replacement node, and select Run from the resulting menu. Click Yes in the confirmation window that opens.
- In the window that appears when processing completes, click OK.
Note: In the data that is exported from the Replacement node, a new variable is created for each variable that is replaced (in this example, SES, URBANICITY, and DONOR_GENDER). The original variable is not overwritten. Instead, the new variable has the same name as the original variable but is prefixed with REP_. The original version of each variable also exists in the exported data and has the role Rejected.
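The effect of these replacement rules on the exported data can be sketched in plain Python. This is only an illustration of the behavior described above, not how SAS Enterprise Miner implements it: the sample rows, the `apply_replacements` helper, and the use of `None` to stand for a missing value are all assumptions.

```python
# Hypothetical rows; the variable names come from the text, the values are made up.
rows = [
    {"SES": "2", "URBANICITY": "S", "DONOR_GENDER": "F"},
    {"SES": "?", "URBANICITY": "?", "DONOR_GENDER": "A"},
    {"SES": "1", "URBANICITY": "C", "DONOR_GENDER": "M"},
]

# Rules as entered in the Replacement Editor: variable -> {level: replacement}.
# None stands in for a missing value (the _MISSING_ replacement value).
replacements = {
    "SES": {"?": None},
    "URBANICITY": {"?": None},
    "DONOR_GENDER": {"A": "_UNKNOWN_"},
}


def apply_replacements(rows, replacements):
    """Add a REP_ copy of each replaced variable and keep the original."""
    exported = []
    for row in rows:
        new_row = dict(row)  # the original variables are not overwritten
        for var, mapping in replacements.items():
            value = row[var]
            # Levels without a rule pass through unchanged.
            new_row["REP_" + var] = mapping.get(value, value)
        exported.append(new_row)
    # The original version of each replaced variable gets the role Rejected.
    roles = {var: "Rejected" for var in replacements}
    return exported, roles


exported, roles = apply_replacements(rows, replacements)
print(exported[1])
print(roles)
```

In the second row, SES keeps its original value ? while REP_SES is missing, and REP_DONOR_GENDER holds _UNKNOWN_ in place of A.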
Tip: To view the data that is exported by a node, click the ellipsis button that represents the value of the General property Exported Data in the Properties Panel. To view the exported variables, click Properties in the window that opens, and then view the Variables tab. Similarly, you can view the data that is imported and used by a node by clicking the ellipsis button that represents the value of the General property Imported Data in the Properties Panel.