The Gradient Boosting node uses a partitioning algorithm to search for an optimal partition of the data for a single target variable. Gradient boosting resamples the analysis data several times and combines the results into a weighted average of the resampled data sets. Tree boosting creates a series of decision trees that together form a single predictive model.

Like decision trees, boosting makes no assumptions about the distribution of the data. Boosting is less prone to overfitting than a single decision tree, and if a single decision tree fits the data fairly well, then boosting often improves the fit. For more information about the Gradient Boosting node, see the SAS Enterprise Miner help documentation.
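The node's settings map onto the standard gradient tree boosting algorithm: each new tree is fit to the residuals (the negative gradient of the loss) of the current ensemble, and predictions are the shrunken sum of all trees. As a rough illustration only, and not SAS code, here is a minimal NumPy sketch for squared-error loss using depth-1 trees (stumps); the function names and toy data are invented for this example.

```python
import numpy as np

def fit_stump(x, residual):
    # Best single split on x for the residuals (squared error).
    best = None
    for t in np.unique(x)[:-1]:
        left, right = residual[x <= t].mean(), residual[x > t].mean()
        sse = ((residual - np.where(x <= t, left, right)) ** 2).sum()
        if best is None or sse < best[0]:
            best = (sse, t, left, right)
    return best[1:]  # (threshold, left value, right value)

def boost(x, y, n_trees=100, lr=0.1):
    # Each new stump is fit to the residuals (negative gradient of
    # squared loss) of the current ensemble's predictions.
    base = y.mean()
    pred = np.full(len(y), base)
    stumps = []
    for _ in range(n_trees):
        t, left, right = fit_stump(x, y - pred)
        pred += lr * np.where(x <= t, left, right)
        stumps.append((t, left, right))
    return base, lr, stumps

def predict(model, x):
    # Shrunken sum of all stump contributions on top of the base value.
    base, lr, stumps = model
    pred = np.full(len(x), base)
    for t, left, right in stumps:
        pred += lr * np.where(x <= t, left, right)
    return pred

# Toy data: y is a noisy step function of x.
rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 200)
y = (x > 0.5).astype(float) + rng.normal(0, 0.1, 200)
model = boost(x, y)
print(predict(model, np.array([0.2, 0.8])))  # near 0 and near 1
```

A real implementation would use deeper trees on many input variables (as the Gradient Boosting node does), but the residual-fitting loop is the core of the method.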
To create a gradient boosting model of the data:

1. Select the Model tab on the Toolbar.

2. Select the Gradient Boosting node icon. Drag the node into the Diagram Workspace.

3. Connect the Control Point node to the Gradient Boosting node.

4. Select the Gradient Boosting node. In the Properties Panel, set the following properties:

   - Click the value of the Maximum Depth property, in the Splitting Rule subgroup, and enter 10. This property determines the maximum depth (the number of generations of splits) of each decision tree that the Gradient Boosting node creates.

   - Click the value of the Number of Surrogate Rules property, in the Node subgroup, and enter 2. Surrogate rules are backup rules that are used when the primary splitting variable is missing. For example, if your primary splitting rule sorts donors by their ZIP codes, then a reasonable surrogate rule would sort by the donor's city of residence.

5. In the Diagram Workspace, right-click the Gradient Boosting node, and select Run from the resulting menu. Click Yes in the confirmation window that opens.

6. In the Run Status window, select OK.
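Surrogate rules can be pictured as an ordered fallback list that a tree node consults when the primary splitting variable is missing. The following Python sketch illustrates only that fallback logic; the donor records, ZIP and city sets, and the branch function are all made up for this example and are not SAS code.

```python
# Hypothetical donor records; 'zip' is the primary splitting variable
# and 'city' acts as a surrogate when 'zip' is missing.
donors = [
    {"zip": "10001", "city": "New York"},
    {"zip": None,    "city": "New York"},       # primary rule cannot fire
    {"zip": "94103", "city": "San Francisco"},
    {"zip": None,    "city": None},             # no surrogate applies either
]

EAST_ZIPS = {"10001"}        # illustrative split: east-coast ZIP codes
EAST_CITIES = {"New York"}   # surrogate: cities that imply an east-coast ZIP

def branch(donor):
    """Return 'left' or 'right' using the primary rule, falling back to surrogates."""
    if donor["zip"] is not None:               # primary splitting rule
        return "left" if donor["zip"] in EAST_ZIPS else "right"
    if donor["city"] is not None:              # surrogate rule 1
        return "left" if donor["city"] in EAST_CITIES else "right"
    return "right"                             # default branch when all rules fail

print([branch(d) for d in donors])  # ['left', 'left', 'right', 'right']
```

Setting Number of Surrogate Rules to 2 means each node keeps up to two such fallback rules, consulted in order, before resorting to the default branch.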