The Gradient Boosting node uses a partitioning algorithm to search for an optimal partition of the data for a single target variable. Gradient boosting is an approach that resamples the analysis data several times to generate results that form a weighted average of the resampled data sets. Tree boosting creates a series of decision trees that together form a single predictive model.
Like decision trees, boosting makes no assumptions about the distribution of the data. Boosting is less prone to overfit the data than a single decision tree. If a decision tree fits the data fairly well, then boosting often improves the fit. For more information about the Gradient Boosting node, see the SAS Enterprise Miner help documentation.
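Conceptually, the model that the node produces is an additive series of shallow trees. The Python sketch below is a rough illustration of that idea only, not the Gradient Boosting node's actual algorithm: each tree is fit to the residuals of the model built so far, and the weighted sum of the trees acts as one predictive model. The synthetic data, number of iterations, tree depth, and learning rate are all hypothetical.

```python
# Illustrative sketch of stagewise tree boosting (not SAS Enterprise Miner's
# implementation). Each shallow tree is fit to the residuals of the model so
# far, and the weighted sum of the trees forms a single predictive model.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))                  # hypothetical predictors
y = X[:, 0] + 0.5 * X[:, 1] ** 2 + rng.normal(scale=0.1, size=500)

learning_rate = 0.1                            # weight applied to each tree
trees = []
prediction = np.full_like(y, y.mean())         # start from the mean response

for _ in range(100):                           # 100 boosting iterations
    residuals = y - prediction                 # what the current model misses
    tree = DecisionTreeRegressor(max_depth=3).fit(X, residuals)
    trees.append(tree)
    prediction += learning_rate * tree.predict(X)

# A new observation is scored by adding the weighted output of every tree
# to the initial estimate, so the series of trees acts as one model.
```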
To create a gradient boosting model of the data:
- Select the Model tab on the Toolbar.
- Select the Gradient Boosting node icon. Drag the node into the Diagram Workspace.
- Connect the Control Point node to the Gradient Boosting node.
- Select the Gradient Boosting node. In the Properties Panel, set the following properties:
  - Click on the value for the Maximum Depth property, in the Splitting Rule subgroup, and enter 10. This property determines the maximum number of generations (levels of splits) in each decision tree created by the Gradient Boosting node. A hedged code parallel to this setting appears after these steps.
  - Click on the value for the Number of Surrogate Rules property, in the Node subgroup, and enter 2. Surrogate rules are backup rules that are used in the event of missing data. For example, if your primary splitting rule sorts donors based on their ZIP codes, then a reasonable surrogate rule would sort them based on the donor's city of residence. A sketch of this fallback idea also appears after these steps.
- In the Diagram Workspace, right-click the Gradient Boosting node, and select Run from the resulting menu. Click Yes in the confirmation window that opens.
- In the Run Status window, select OK.
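The Maximum Depth setting from step 4 has a rough counterpart in open-source gradient boosting implementations. The sketch below uses scikit-learn's GradientBoostingClassifier, whose max_depth parameter plays a similar role by limiting the depth of each tree in the series. This is an outside parallel for orientation only, not the SAS node's own procedure, and the synthetic data and remaining parameter values are assumptions.

```python
# Hedged, non-SAS parallel to the Maximum Depth = 10 setting, using
# scikit-learn on synthetic data (assumptions for illustration only).
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = GradientBoostingClassifier(
    max_depth=10,       # analogous role to the node's Maximum Depth property
    n_estimators=100,   # number of trees in the boosting series
    learning_rate=0.1,  # weight applied to each successive tree
    random_state=0,
)
model.fit(X_train, y_train)
print("Holdout accuracy:", model.score(X_test, y_test))
```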
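The surrogate rules described in step 4 can be pictured with a small, purely hypothetical sketch: when the primary splitting variable (ZIP code) is missing for a donor, the split falls back to the first surrogate (city), then to a second surrogate (state), and finally to a default branch. The ZIP codes, cities, state, and default branch below are invented for illustration and do not reflect the node's actual rules.

```python
# Purely illustrative surrogate-rule sketch (hypothetical values, not the
# node's actual rules). The primary rule splits donors on ZIP code; if ZIP
# code is missing, surrogate rules on city and then state are tried; if all
# are missing, the donor follows a default branch.
HIGH_RESPONSE_ZIPS = {"27513", "27511"}       # hypothetical ZIP codes
HIGH_RESPONSE_CITIES = {"Cary", "Apex"}       # hypothetical cities

def assign_branch(donor: dict) -> str:
    if donor.get("zip") is not None:          # primary splitting rule
        return "left" if donor["zip"] in HIGH_RESPONSE_ZIPS else "right"
    if donor.get("city") is not None:         # surrogate rule 1
        return "left" if donor["city"] in HIGH_RESPONSE_CITIES else "right"
    if donor.get("state") is not None:        # surrogate rule 2
        return "left" if donor["state"] == "NC" else "right"
    return "right"                            # default branch when all are missing

print(assign_branch({"zip": None, "city": "Cary"}))   # surrogate used -> "left"
```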