Computer Resources

The computer time and memory required for an analysis depend on the number of cases, the number of variables, the complexity of the model, and the training algorithm. For many modeling methods, there is a trade-off between time and memory.
For all modeling nodes, memory is required for the operating system, SAS supervisor, and the Enterprise Miner diagram and programs, resulting in an overhead of about 20 to 30 megabytes.
The following notation will be used:
N
the number of cases.
V
the number of input variables.
I
the number of input terms or units, including dummy variables, intercepts, interactions, and polynomials.
W
the number of weights in a neural network.
O
the number of output units.
D
the average depth of a tree.
R
the number of times the training data are read in logistic regression or neural nets, which depends on the training technique, the termination criteria, the model, and the data. R is typically much larger for neural nets than for logistic regression. In regard to training techniques, R is usually smallest for Newton-Raphson or Levenberg-Marquardt, larger for quasi-Newton, and still larger for conjugate gradients.
S
the number of steps in a stepwise regression, or 1 if stepwise regression is not used.
For the Decision Tree node, the minimum additional memory required for an analysis is about 8N bytes. Training will be considerably faster if there is enough RAM to hold the entire data set, which is about 8N(V+1) bytes. If the data will not fit in memory, they must be stored in a utility file. Memory is also required to hold summary statistics for a node, such as means or a contingency table, but this amount is usually much smaller than the amount required for the data.
For the Regression node, the memory required depends on the type of model and on the training technique. For linear regression, memory usage is dominated by the SSCP matrix, which requires 8I2 bytes. For logistic regression, memory usage depends on the training technique as documented in the SAS/OR Technical Report: The NLP Procedure, ranging from about 40I bytes for the conjugate gradient technique to about 8I2 bytes for the Newton-Raphson technique.
For the Neural Network node, memory usage depends on the training technique as documented in the SAS/OR Technical Report: The NLP Procedure. About 40W bytes are needed for the conjugate gradient technique, while 4W2 bytes are needed for the quasi-Newton and Levenberg-Marquardt techniques. For a network with biases and H hidden units in one layer, W = (I+1)H + (H+1)O.
For both logistic regression and neural networks, the conjugate gradient technique, which requires the least memory, must usually read the training data many more times than the Newton-Raphson and Levenberg-Marquardt techniques.
Assuming that the number of training cases is greater than the number of inputs or weights, the time required for training is approximately proportional to:
NI2
for linear regression.
SRNI
for logistic regression using conjugate gradients.
SRNI2
for logistic regression using quasi-Newton or Newton-Raphson. Note that R is usually considerably less for these techniques than for conjugate gradients.
DNI
for decision tree-based models.
RNW
for neural networks using conjugate gradients.
RNW2
for neural networks using quasi-Newton or Levenberg-Marquardt. Note that R is usually considerably less for these techniques than for conjugate gradients.