There are three cases where SAS Enterprise Miner uses a SAS grid:
- during model training, for parallel execution of nodes within a model training flow
- during model training, for load balancing of multiple flows from multiple data modelers
- during model scoring, for parallel batch scoring
The workflow for SAS Enterprise Miner during the model training phase consists of executing a series of different models against a common set of data. Model training is CPU- and I/O-intensive. The process flow diagram design of SAS Enterprise Miner lends itself to processing on a SAS grid, because each model is independent of the other models. SAS Enterprise Miner generates the SAS program to execute the user-created flow, and also automatically inserts the syntax needed to run each model on the grid. Because the models can execute in parallel on the grid, the entire process is accelerated.
In addition, SAS Enterprise Miner is typically used by multiple users who are simultaneously performing model training. Using a SAS grid can provide multi-user load balancing of the flows that are submitted by these users, regardless of whether the flows contain parallel subtasks.
The output from training a model is usually Base SAS code that is known as scoring code. The scoring code represents the model, and there are usually many models that need to be scored. You can use SAS Grid Manager to score these models in parallel, which accelerates the scoring process. You can use any of these methods to perform parallel scoring:
- Use the grid wrapper code to submit each model independently to the SAS grid.
- Use the Schedule Manager plug-in to create a flow that contains multiple models and schedule the flow to the SAS grid. Because the models are independent of each other, they are distributed across the grid when the flow runs.
- Use SAS Data Integration Studio to create a flow that loops over multiple models and spawns each model to the SAS grid.
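
The first method can be sketched roughly as follows. This is a minimal illustration, not the code that SAS Enterprise Miner generates. It assumes that SAS Grid Manager and SAS/CONNECT are available, that the metadata connection options are already set for the session, and that an application server context named SASApp exists. The libref, paths, data set names, and score code file names are hypothetical.

```sas
/* Sketch only. SASApp, the SHARED libref, all paths, and all data set  */
/* names are assumptions made for illustration.                         */

/* Route subsequent SAS/CONNECT sessions to the grid. */
%let rc = %sysfunc(grdsvc_enable(_all_, resource=SASApp));

/* Let RSUBMIT start a grid session automatically when none exists. */
options autosignon;

/* Score with the first model. WAIT=NO returns control immediately, */
/* so the next task can be submitted while this one runs.           */
rsubmit score1 wait=no;
   libname shared "/shared/scoring";
   data shared.scored_model1;
      set shared.customers;
      /* Scoring code exported from model training (hypothetical file) */
      %include "/shared/scoring/model1_score.sas";
   run;
endrsubmit;

/* Score with the second model in a separate grid session. */
rsubmit score2 wait=no;
   libname shared "/shared/scoring";
   data shared.scored_model2;
      set shared.customers;
      %include "/shared/scoring/model2_score.sas";
   run;
endrsubmit;

/* Wait for both tasks to finish, then end the grid sessions. */
waitfor _all_ score1 score2;
signoff _all_;
```

Because each task reads the same input table and writes its own output table, the tasks do not depend on one another and the grid can place them on different nodes. The input data and score code files must be in a location that every grid node can reach.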
Grid Processing with SAS Enterprise Miner