Use the Association
node to identify association and sequence relationships within the
data. A typical association question is, “If a customer buys cat food, how likely
is the customer to also buy cat litter?” In sequence
discovery, the question is extended to include a time window: “If
a customer buys cat food today, how likely is the customer to buy
cat litter within the next week?”
|
|
Use the Cluster node
to perform observation clustering, which can be used to segment databases.
Clustering places objects into groups or clusters suggested by the
data. The objects in each cluster tend to be similar to each other
in some sense, and objects in different clusters tend to be dissimilar.
|
|
The DMDB node creates
a data mining database that provides summary statistics and factor-level
information for class and interval variables in the imported data
set. Improvements to SAS Enterprise Miner have eliminated the previous
need to use the DMDB node to optimize the performance of nodes. However,
the DMDB database can still provide quick summary statistics for class
and interval variables at a given point in a process flow diagram.
|
|
The Graph Explore node
is an advanced visualization tool that enables you to interactively
explore large volumes of data to uncover patterns and trends and to
reveal extreme values in the database. You can analyze univariate
distributions, investigate multivariate distributions, and create scatter
plots, box plots, constellation charts, 3-D charts, and so on.
|
|
The Market Basket node
performs association rule mining over transaction data in conjunction
with item taxonomy. Market basket analysis uses the information from
the transaction data to give you insight, for example, about which products
tend to be purchased together. Market basket analysis
is not limited to the retail
marketing domain and can be abstracted to other areas, such as word
co-occurrence relationships in text documents.
|
|
Use the StatExplore
node to examine the statistical properties of an input data set. You
can use the StatExplore node to compute standard univariate distribution
statistics, to compute standard bivariate statistics by class target
and class segment, and to compute correlation statistics for interval
variables by interval input and target.
|
|
Text Miner¹
|
|
Variable clustering
is a useful tool for data reduction and can remove collinearity, decrease
variable redundancy, and help to reveal the underlying structure of
the input variables in a data set. When properly used as a variable-reduction
tool, the Variable Clustering node can replace a large set of variables
with the set of cluster components with little loss of information.
|
|
¹ The Text Miner node is an add-on for SAS Enterprise Miner and therefore does not appear in all installed versions. For more information about this node, see the SAS Text Miner documentation. |
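
The cat food and cat litter question posed for the Association node can be made concrete with a small calculation. The following Python sketch is illustrative only and is not how SAS Enterprise Miner implements the node. The transaction data and the function names (`support`, `confidence`, `lift`) are hypothetical, chosen to show the standard rule measures on a toy example.

```python
# Hypothetical toy transactions; each set is one customer's basket.
transactions = [
    {"cat food", "cat litter"},
    {"cat food", "cat litter", "milk"},
    {"cat food", "milk"},
    {"milk", "bread"},
    {"cat litter", "bread"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in itemset."""
    items = set(itemset)
    return sum(items <= basket for basket in baskets) / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Estimate of P(consequent | antecedent) from the baskets."""
    base = support(antecedent, baskets)
    if base == 0:
        return 0.0
    joint = support(set(antecedent) | set(consequent), baskets)
    return joint / base


def lift(antecedent, consequent, baskets):
    """Confidence divided by the consequent's overall support."""
    cons = support(consequent, baskets)
    if cons == 0:
        return 0.0
    return confidence(antecedent, consequent, baskets) / cons


rule_conf = confidence({"cat food"}, {"cat litter"}, transactions)
rule_lift = lift({"cat food"}, {"cat litter"}, transactions)
print(f"confidence = {rule_conf:.3f}, lift = {rule_lift:.3f}")
```

In this toy data, two of the three baskets that contain cat food also contain cat litter, so the confidence of the rule is 2/3. A lift above 1 suggests that the two items appear together more often than they would if purchases were independent.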