Variable importance is calculated from how the variables are used in the finished tree. Three metrics are used: count, SSE, and relative importance. The count-based variable importance is the number of times in the entire tree that a given variable is used in a split. The SSE and relative importance are calculated from the training set. If a validation set exists, they are calculated again from it, and the validation-based values are reported as “VSSE” and “VIMPORT.”
The SSE-based variable importance is based on the nodes in which the variable is used in a split. For each variable, the change in SSE that results from each such split is found. The change is
$$\Delta_d = \mathrm{SSE}_d - \sum_i \mathrm{SSE}_i^d$$
where $d$ denotes the node and $i$ indexes the children of that node. $\mathrm{SSE}_d$ is then the SSE if the node is treated as a leaf, and $\sum_i \mathrm{SSE}_i^d$ is the SSE of the node after it has been split. If the change in SSE is negative (which is possible when you use the validation set), then the change is set to 0.
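The change in SSE for a single split can be sketched in Python as follows. This is a minimal illustration, not the procedure's implementation: the function names `sse` and `sse_change` are hypothetical, and it assumes a regression tree in which each node's prediction is a fixed value (for example, the training-set mean), which is why the change can turn negative on validation data.

```python
def sse(values, prediction):
    """Sum of squared errors of the observations around a node's prediction."""
    return sum((v - prediction) ** 2 for v in values)


def sse_change(parent, children):
    """Change in SSE from splitting one node, set to 0 if negative.

    parent   -- (values, prediction) for the node treated as a leaf
    children -- list of (values, prediction), one entry per child node
    """
    parent_sse = sse(*parent)
    child_sse = sum(sse(vals, pred) for vals, pred in children)
    return max(parent_sse - child_sse, 0.0)


# Training data: the split separates low and high targets cleanly.
train_delta = sse_change(
    ([1.0, 2.0, 9.0, 10.0], 5.5),
    [([1.0, 2.0], 1.5), ([9.0, 10.0], 9.5)],
)

# Hypothetical validation data scored with the same predictions:
# here the split makes the fit worse, so the change is clamped to 0.
valid_delta = sse_change(
    ([5.0, 6.0, 5.0, 6.0], 5.5),
    [([5.0, 6.0], 1.5), ([5.0, 6.0], 9.5)],
)

print(train_delta, valid_delta)
```

In the training case the parent SSE is 65 and the children contribute 1 in total, so the change is 64. In the validation case the unclamped change would be negative, so it is reported as 0.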
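The SSE-based importance is then
$$\sqrt{\sum_{d=1}^{D} \Delta_d}$$
where the sum runs over the $D$ nodes in which the variable is used in a split and $\Delta_d$ is the change in SSE at node $d$.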
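The relative importance metric is based on the SSE-based importance of each variable. The maximum SSE-based variable importance is found. Then all the variables are assigned a relative importance, which is simply
$$\frac{\text{SSE-based importance of the variable}}{\text{maximum SSE-based importance}}$$
so the variable with the largest SSE-based importance has a relative importance of 1.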
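The three metrics can be combined in a short Python sketch. The function `variable_importance` and the variable names in the example are hypothetical, and the sketch assumes that the SSE-based importance is the square root of the summed per-node changes. It takes one (variable, change in SSE) pair per split node and is meant only to make the arithmetic concrete.

```python
import math
from collections import defaultdict


def variable_importance(splits):
    """Compute count, SSE-based, and relative importance per variable.

    splits -- iterable of (variable_name, sse_change) pairs, one per split node
    """
    counts = defaultdict(int)
    totals = defaultdict(float)
    for name, delta in splits:
        counts[name] += 1                 # count-based importance
        totals[name] += max(delta, 0.0)   # negative changes are set to 0

    # Assumed form: square root of the summed changes in SSE.
    sse_importance = {name: math.sqrt(total) for name, total in totals.items()}
    top = max(sse_importance.values(), default=0.0)

    return {
        name: {
            "count": counts[name],
            "sse": sse_importance[name],
            "relative": sse_importance[name] / top if top > 0 else 0.0,
        }
        for name in sse_importance
    }


# Hypothetical tree with four split nodes on two variables.
splits = [("age", 64.0), ("income", 9.0), ("age", 36.0), ("income", -4.0)]
result = variable_importance(splits)
for name, metrics in result.items():
    print(name, metrics)
```

Here "age" accumulates a total change of 100, giving an SSE-based importance of 10 and a relative importance of 1. For "income" the negative change is dropped, leaving a total of 9, an SSE-based importance of 3, and a relative importance of 0.3.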