The AutoNeural node can be used to automatically configure a neural network. It implements a search algorithm that incrementally selects activation functions for a variety of multilayer networks.
The Decision Tree node enables you to fit decision tree models to your data. The implementation includes features found in a variety of popular decision tree algorithms (for example, CHAID, CART, and C4.5). The node supports both automatic and interactive training. When you run the Decision Tree node in automatic mode, it ranks the input variables by the strength of their contribution to the tree; this ranking can be used to select variables for use in subsequent modeling. You can override any automatic step by defining a splitting rule or by pruning explicit nodes or subtrees. Interactive training enables you to explore and evaluate data splits as you develop them.
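The automatic ranking of inputs rests on how much each candidate split reduces node impurity. A minimal sketch of that calculation, using Gini impurity (one of several criteria such a node can use; the function names are illustrative, not part of SAS Enterprise Miner):

```python
def gini(labels):
    """Gini impurity of a set of class labels (0 = pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def split_worth(labels_left, labels_right):
    """Impurity reduction of a binary split; larger is a stronger split.
    Ranking inputs by their best split's worth mirrors automatic-mode ranking."""
    parent = labels_left + labels_right
    n = len(parent)
    weighted = (len(labels_left) / n) * gini(labels_left) \
             + (len(labels_right) / n) * gini(labels_right)
    return gini(parent) - weighted
```

A split that separates the classes perfectly scores higher than one that leaves both children mixed.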
The Dmine Regression node enables you to compute a forward stepwise least squares regression model. At each step, the independent variable that contributes the most to the model R-square value is selected. The node can also automatically bin continuous terms.
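One forward step of that selection can be sketched as follows: compute the R-square each remaining input would contribute and pick the maximum (a toy single-variable illustration with hypothetical names, not the node's actual implementation):

```python
def r_squared(x, y):
    """R-square of a simple least squares fit of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    syy = sum((yi - my) ** 2 for yi in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return (sxy * sxy) / (sxx * syy)

def best_candidate(candidates, y):
    """One forward step: select the input contributing the most R-square."""
    return max(candidates, key=lambda name: r_squared(candidates[name], y))
```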
The DMNeural node fits an additive nonlinear model that uses bucketed principal components as inputs to predict a binary or an interval target variable, with automatic selection of an activation function.
The Ensemble node enables you to create new models by combining the posterior probabilities (for class targets) or the predicted values (for interval targets) from multiple predecessor models.
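For a class target, the simplest combination is averaging the posterior probabilities of the predecessor models. A minimal sketch (averaging is one of the node's combination methods; the function name is mine):

```python
def ensemble_posterior(model_posteriors):
    """Average class posterior probabilities from several predecessor models.
    Each element is a dict mapping class level -> posterior probability."""
    n = len(model_posteriors)
    classes = model_posteriors[0].keys()
    return {c: sum(p[c] for p in model_posteriors) / n for c in classes}
```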
The Gradient Boosting node uses tree boosting to create a series of decision trees that together form a single predictive model. Each tree in the series is fit to the residual of the prediction from the earlier trees in the series. The residual is defined in terms of the derivative of a loss function; for squared error loss with an interval target, the residual is simply the target value minus the predicted value. Boosting is defined for binary, nominal, and interval targets.
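The squared-error case can be sketched with one-split regression stumps standing in for the node's decision trees (a toy illustration under assumed defaults, not SAS code; function names and the shrinkage parameter are mine):

```python
def fit_stump(x, residuals):
    """Fit a depth-1 regression tree (stump) to the residuals: find the best
    threshold on x, predicting the mean residual on each side."""
    best = None
    for t in sorted(set(x)):
        left = [r for xi, r in zip(x, residuals) if xi <= t]
        right = [r for xi, r in zip(x, residuals) if xi > t]
        if not left or not right:
            continue
        lmean, rmean = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lmean) ** 2 for r in left) \
            + sum((r - rmean) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lmean, rmean)
    _, t, lmean, rmean = best
    return lambda xi: lmean if xi <= t else rmean

def boost(x, y, n_trees=10, shrinkage=0.5):
    """Gradient boosting for squared error loss: each new stump is fit to the
    residual (target minus current prediction) left by the earlier stumps."""
    mean_y = sum(y) / len(y)
    pred = [mean_y] * len(y)          # start from the overall mean
    for _ in range(n_trees):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(x, residuals)
        pred = [pi + shrinkage * stump(xi) for pi, xi in zip(pred, x)]
    return pred
```

Each round shrinks the remaining residual, so the combined series fits the training data progressively better.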
The LARS node enables you to use least angle regression algorithms to perform variable selection and model fitting tasks. The LARS node can produce models that range from simple intercept models to complex multivariate models with many inputs. When the LARS node is used for model fitting, it uses criteria from either least angle regression or LASSO regression to choose the optimal model.
The MBR (Memory-Based Reasoning) node enables you to identify similar cases and to apply information obtained from those cases to a new record. The MBR node uses k-nearest neighbor algorithms to categorize or predict observations.
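The k-nearest neighbor idea can be sketched in a few lines: find the k stored cases closest to the new record and let them vote (Euclidean distance and majority vote are assumptions here; the node offers its own distance and combination settings):

```python
from collections import Counter

def knn_classify(train, query, k=3):
    """Memory-based reasoning sketch: classify a new record from the k most
    similar stored cases. train is a list of (feature_tuple, label) pairs."""
    def dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) ** 0.5
    neighbors = sorted(train, key=lambda case: dist(case[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]
```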
The Model Import node enables you to import models that were not created by SAS Enterprise Miner into the SAS Enterprise Miner environment. For example, models that were created by using SAS PROC LOGISTIC can be run, assessed, and modified in SAS Enterprise Miner.
The Neural Network node enables you to construct, train, and validate multilayer feedforward neural networks. You can select from several predefined architectures, or manually select input, hidden, and target layer functions and options.
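A forward pass through the simplest such network, one hidden layer with tanh units and a linear output, can be sketched as follows (tanh and a single output are assumptions for illustration; the node supports many layer functions):

```python
import math

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """One forward pass through a single-hidden-layer feedforward network.
    w_hidden is a list of weight vectors (one per hidden unit)."""
    hidden = [math.tanh(sum(wi * xi for wi, xi in zip(w, x)) + b)
              for w, b in zip(w_hidden, b_hidden)]
    return sum(wo * h for wo, h in zip(w_out, hidden)) + b_out
```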
The Partial Least Squares node is a tool for modeling continuous and binary targets, based on SAS/STAT PROC PLS. The node produces DATA step score code and standard predictive model assessment results.
The Regression node enables you to fit both linear and logistic regression models to your data. You can use continuous, ordinal, and binary target variables, and both continuous and discrete variables as inputs. The node supports the stepwise, forward, and backward selection methods. A point-and-click interaction builder enables you to create higher-order modeling terms.
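Scoring a fitted logistic regression model reduces to a linear predictor passed through the logistic function. A minimal sketch (the coefficient and record names are hypothetical; the node itself emits equivalent DATA step score code):

```python
import math

def logistic_score(coeffs, intercept, record):
    """Posterior probability of the event from a fitted logistic model.
    coeffs maps input name -> coefficient; record maps input name -> value."""
    eta = intercept + sum(coeffs[name] * record[name] for name in coeffs)
    return 1.0 / (1.0 + math.exp(-eta))
```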
The Rule Induction node enables you to improve the classification of rare events in your modeling data. The node creates a Rule Induction model that uses split techniques to remove the largest pure split node from the data. Rule Induction also creates binary models for each level of a target variable and ranks the levels from the rarest event to the most common. After all levels of the target variable are modeled, the score code is combined into a SAS DATA step.
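The level-ordering step, ranking target levels from rarest to most common, is simple frequency counting. A sketch (the function name is mine, not a SAS identifier):

```python
from collections import Counter

def rank_levels(target_values):
    """Order target levels from the rarest event to the most common,
    as Rule Induction does before building one binary model per level."""
    counts = Counter(target_values)
    return [level for level, _ in sorted(counts.items(), key=lambda kv: kv[1])]
```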
The SVM node uses supervised machine learning to solve binary classification problems, and supports polynomial, radial basis function, and sigmoid nonlinear kernels. The standard SVM formulation solves a binary classification problem by constructing a hyperplane that maximizes the margin between the two classes. The SVM node does not support multi-class problems or support vector regression.
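The three nonlinear kernels named above are standard functions of two feature vectors. A sketch with commonly used parameterizations (the default values for degree, gamma, and coef0 here are illustrative assumptions, not the node's defaults):

```python
import math

def linear_kernel(u, v):
    """Plain dot product; the building block for the other kernels."""
    return sum(ui * vi for ui, vi in zip(u, v))

def poly_kernel(u, v, degree=3, coef0=1.0):
    """Polynomial kernel: (u.v + coef0) ** degree."""
    return (linear_kernel(u, v) + coef0) ** degree

def rbf_kernel(u, v, gamma=1.0):
    """Radial basis function kernel: exp(-gamma * ||u - v||^2)."""
    sq = sum((ui - vi) ** 2 for ui, vi in zip(u, v))
    return math.exp(-gamma * sq)

def sigmoid_kernel(u, v, gamma=1.0, coef0=0.0):
    """Sigmoid kernel: tanh(gamma * u.v + coef0)."""
    return math.tanh(gamma * linear_kernel(u, v) + coef0)
```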
The TwoStage node enables you to compute a two-stage model that predicts a class target and an interval target at the same time. The interval target variable is usually a value that is associated with a level of the class target.
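The two-stage idea can be sketched as: score the class target first, then predict the interval value only when the event is predicted (the callables and the 0.5 cutoff are hypothetical stand-ins for the two fitted stage models):

```python
def two_stage_predict(record, class_model, value_model, cutoff=0.5):
    """Two-stage prediction sketch: stage 1 scores the class event probability;
    stage 2 predicts the associated interval value only when the event is
    predicted, otherwise the value defaults to 0."""
    p_event = class_model(record)
    if p_event >= cutoff:
        return p_event, value_model(record)
    return p_event, 0.0
```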