The following list highlights
some of the advantages and disadvantages of each approach:
-
If you know that you want your code to run
in the server, then using the SCORE statement of the IMSTAT procedure
provides assurance: either the code runs in the server and the statement
succeeds, or the statement fails. Large tables are never transferred
from the server to the client for processing.
-
If you want to use the __first_in_partition or __last_in_partition
variables, you must use the SCORE statement. Those variables can provide
behavior that is similar to BY-group processing with the FIRST. and LAST.
variables.
-
To use the DATA step and keep processing
in the server, your program must use the DSACCEL=ANY system option
and meet the requirements that are commonly associated with scoring.
When the DATA step determines that those requirements are met,
it runs the program in the server as an optimization.
-
Error messages are more helpful
and troubleshooting is easier when you are using the DATA step than
when you are running a program with the SCORE statement.
As a best practice,
do initial development with the DATA step, where iteration is faster.
After the program is complete, or nearly so, switch to running the
program with the SCORE statement, which provides assurance that the
program runs in the server.
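The development stage of that workflow might look like the following minimal sketch. The host name, port, tag, table names, and column names are hypothetical placeholders, and the MSGLEVEL=I setting is included only on the assumption that it causes the log to report where the step ran.

```sas
/* Sketch only: host, port, tag, tables, and columns are hypothetical. */
options dsaccel=any msglevel=i;   /* allow the DATA step to run in the server */

libname lasr sasiola host="lasr.example.com" port=10010 tag="hps";

/* One in-memory input table and one in-memory output table. */
data lasr.sales_scored;
   set lasr.sales;
   margin = revenue - cost;       /* simple row-wise computation */
run;
```

If the program does not meet the requirements for running in the server, the step is not guaranteed to stay in the server, which is why the final version is moved to the SCORE statement.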
When you work with the
SCORE statement, use the IMSTAT procedure’s PGMMSG option so that
the server records any error information. You can read this information
in the &_PGMMSG_ temporary table. Output from PUT statements in your
program is written to that table rather than to the SAS log.
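As a rough illustration of reading those messages, the sketch below writes a small scoring program to a temporary file, runs it with the SCORE statement, and then fetches the message table. The server connection, table name, and column names are hypothetical, and the placement of the PGMMSG and TEMPTABLE options is an assumption to verify against the IMSTAT procedure reference.

```sas
/* Sketch only: host, port, tag, table, and columns are hypothetical. */
libname lasr sasiola host="lasr.example.com" port=10010 tag="hps";

filename pgm temp;

/* Write a small scoring program to a temporary file. */
data _null_;
   file pgm;
   put "margin = revenue - cost;";
   put "if margin < 0 then put 'Negative margin found';";
run;

proc imstat pgmmsg;               /* ask the server to record program messages */
   table lasr.sales;
   score code=pgm temptable;      /* run the program in the server */
run;
   table lasr.&_pgmmsg_;          /* temporary table that holds the messages */
   fetch;                         /* display the recorded messages */
quit;
```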
As a final consideration,
before writing a DATA step program, check the list of alternatives
in the preceding section to see whether a statement for the IMSTAT
procedure can accomplish your goal. The GROUPBY, AGGREGATE, and TRANSFORM
statements can be used to perform data preparation on in-memory tables.