IMSTAT Procedure

SCORE Statement

The SCORE statement applies scoring rules to an in-memory table. The results can take different forms. They can be displayed as tables in the SAS session, output data sets on the client, or temporary tables in the server. The most common use of the SCORE statement is to execute DATA step code on some or all rows of an in-memory table and to produce corresponding output records.

Syntax

SCORE CODE=file-reference </ options>;

Required Argument

CODE=file-reference

specifies a file reference to the SAS program that performs the scoring.

Alias PGM=

SCORE Statement Options

KEEP=(variable-list)

KEEP=variable-name

specifies one or more variables that you want to transfer from the input data to the scoring results. You can use _ALL_ for all variables, _NUMERIC_ for all numeric variables, and other valid variable list names. If this option is not specified, then no variables are transferred from the input data (the table that is being scored), unless they are assigned values in the scoring code.

Alias TABLEVARS=

NOPREPARSE

prevents the procedure from pre-parsing and pre-generating the program code that is referenced in the CODE= option. If you know that the code is correct, you can specify this option to save resources. The server always parses the code, but the procedure might give more detailed error messages than the server does. When NOPREPARSE is specified, the server assumes that the code is correct. If the code fails to compile, the server reports that it could not parse the code, but not where the error occurred.

Alias NOPREP

OUT=libref.member-name

specifies the name of an output data set in which to store the scoring results. If the result set contains variables that match those in the input data set, then format information is transferred to the output data set. The OUT= option and the TEMPTABLE option are mutually exclusive. If you specify the OUT= option, a temporary table is not created in the server.

Alias DATA=
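
As a minimal sketch of the OUT= option, the following step writes the scoring results to a data set on the client and prints it. The libref lasrlib, the fileref fref, the file path, and the data set name work.carstats are assumptions for illustration only.

```sas
filename fref '/path/to/scorepgm.sas';   /* assumed location of the scoring program */

proc imstat;
   table lasrlib.cars;                    /* assumes the table is already in memory */
   score code=fref / out=work.carstats;   /* results go to the client, so no temporary table is created */
run;
quit;

proc print data=work.carstats;            /* inspect the scoring results */
run;
```

Because OUT= and TEMPTABLE are mutually exclusive, this request does not specify TEMPTABLE.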

PARTITION <=partition-key>

takes advantage of partitioning for tables that are partitioned. When this option is specified, the scoring code is executed in the order of the partitions. If the data are also ordered within the partition, the observations are processed in that order. If the scoring code uses the reserved symbols __first_in_partition or __last_in_partition, then the data are also processed in partitioned order. Although the observations are processed in a specific order, the execution occurs in concurrent threads (in parallel). Different threads are assigned to work on different partitions.

If you do not specify the optional partition-key, then the analysis is performed for all partitions. If you do specify a partition-key, then the analysis is performed for the partitions that match the specified key value only. You can use the PARTITIONINFO statement to retrieve the valid partition-key values for a table.

You can specify a partition-key in two ways. You can supply a single quoted string that is passed to the server, or you can specify the elements of a composite key separated by commas. For example, if you partition a table by variables GENDER and AGE, with formats $1 and BEST12, respectively, then the composite partition key has a length of 13. You can specify the partition for the 11-year-old females as follows:

score / partition="F          11"; /* passed directly to the server */
score / partition="F","11";        /* composed by the procedure */

If you choose the second format, the procedure composes a key based on formatting information from the server.

Alias PART=

Interaction This option is effective when used with partitioned in-memory tables only.
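
The length of 13 in the composite key example is the sum of the format widths: 1 character for $1 plus 12 characters for BEST12. The following DATA step is a minimal sketch of that arithmetic. It assumes that the composed key is equivalent to concatenating the formatted values, which is an illustration and not a description of how the procedure builds the key internally.

```sas
data _null_;
   length key $13;                           /* 1 ($1) + 12 (BEST12) characters */
   key = put('F', $1.) || put(11, best12.);  /* BEST12 right-aligns the number in 12 columns */
   keylen = lengthc(key);                    /* length of the key, including blanks */
   put key= $quote15. keylen=;               /* write the quoted key and its length to the log */
run;
```

The quoted key written to the log should match the string that is passed directly to the server in the first form above.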

SAVE=table-name

saves the result table so that you can use it in other IMSTAT procedure statements like STORE, REPLAY, and FREE. The value for table-name must be unique within the scope of the procedure execution. The name of a table that has been freed with the FREE statement can be used again in subsequent SAVE= options.
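
A minimal sketch of the SAVE= option follows. The table name carscore, the libref lasrlib, and the fileref fref are hypothetical, and the sketch assumes that the REPLAY and FREE statements accept the saved table name as their argument.

```sas
proc imstat;
   table lasrlib.cars;
   score code=fref / partition save=carscore;  /* carscore is a hypothetical result table name */
   replay carscore;                            /* display the saved result table again */
   free carscore;                              /* release it so that the name can be reused */
run;
```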

SETSIZE

requests that the server estimate the size of the result set. When the SETSIZE option is specified, the procedure does not create a result table. Instead, it writes to the SAS log the number of rows that the request returns and the expected memory consumption for the result set (in KB). See the following log sample:

NOTE: The LASR Analytic Server action request for the STATEMENT
      statement would return 17 rows and approximately
      3.641 kBytes of data.

The typical use of the SETSIZE option is to estimate the size of the result set when you are unsure whether the SAS session can handle a large result set. Be aware that to determine the size of the result set, the server has to perform the work as if you were receiving the actual result set, so requesting the estimate does consume resources on the server. The estimated number of KB is very close to the actual memory consumption of the result set. It might not be immediately obvious how this size relates to the displayed table, because many tables contain hidden columns. In addition, some elements of the result set might not be converted to tabular output by the procedure.
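
As a minimal sketch, a size estimate can be requested before the actual scoring run. The libref lasrlib and the fileref fref are assumptions for illustration.

```sas
proc imstat;
   table lasrlib.cars;
   score code=fref / partition setsize;   /* log reports the row count and estimated KB only */
run;
```

If the reported size is acceptable, the same statement can be submitted again without SETSIZE to produce the result table.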

SYMBOLS=(symbol-list)

specifies one or more symbols that are calculated in the scoring code that you want to transfer as columns to the scoring results. If the SYMBOLS= option is not specified, then all symbols that are assigned values in the program (and that are not just placeholders for intermediate calculations) are transferred to the results. If you use a large program with many assignments, you might want to use the SYMBOLS= option to limit the columns in the results.

Alias SYM=
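
The following sketch combines SYMBOLS= with KEEP= to control which columns appear in the results. It assumes a scoring program that assigns a value to numCars, such as the program in the Details section, and an input table that contains a MAKE variable. The libref lasrlib and the fileref fref are also assumptions.

```sas
proc imstat;
   table lasrlib.cars;
   score code=fref / partition
                     symbols=(numCars)   /* return only this computed symbol */
                     keep=(make);        /* carry this input variable into the results */
run;
```

Other symbols that the program assigns are still computed, but under these assumptions they are not returned as columns.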

TEMPTABLE

generates an in-memory temporary table from the result set. The IMSTAT procedure displays the name of the table and stores it in the _TEMPLAST_ macro variable, provided that the statement executed successfully.

When the IMSTAT procedure exits, all temporary tables created during the IMSTAT session are removed. Temporary tables are not displayed on a TABLEINFO request, unless the temporary table is the active table for the request.
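
A minimal sketch of working with a temporary table follows. It assumes the libref lasrlib and the fileref fref, and it uses the NUMROWS statement only as an example of a follow-up request against the temporary table.

```sas
proc imstat;
   table lasrlib.cars;
   score code=fref / partition temptable;   /* keep the results in the server */
run;
   table lasrlib.&_templast_;               /* make the temporary table the active table */
   numrows;                                 /* example follow-up request */
run;
quit;                                       /* temporary tables are removed when the procedure exits */
```

The first RUN statement matters here: the _TEMPLAST_ macro variable is set only after the SCORE statement has executed, so the reference to it is placed in a later run block.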

Details

To control how output is generated, you can combine options in the SCORE statement with special syntax elements in the scoring code. For example, the PARTITION option can be used to specify that the scoring code is executed separately for each partition or for a specific partition of the data only. If you want to control precisely which observations generate output records, you can use the __lasr_output symbol in your SAS program. When this symbol is set to 1, the row is output. You can also use the __first_in_partition and __last_in_partition variables to programmatically determine the first and last observation in a partition.
The following SAS code is an example:
__lasr_output = 0;
if __first_in_partition then do;      /* 1 */
   totalmsrp = msrp;
   minmsrp   = msrp;
   numCars   = 1;
end; else do;
   totalmsrp + msrp;
   numCars   + 1;
   if (msrp < minmsrp) then minmsrp = msrp;
end;
orgdrive = Origin || drivetrain;      /* 2 */
mpgdiff = mpg_highway - mpg_city;
if __last_in_partition then           /* 3 */
   __lasr_output = 1;
1 For the first observation within a partition, three variables are initialized. The minimum MSRP, the total MSRP, and the number of records in the partition are then accumulated across the remaining observations in the partition.
2 The variable ORGDRIVE is obtained by concatenating the strings of the ORIGIN and DRIVETRAIN variables.
3 When the last record within the partition is reached, the __lasr_output automatic variable is set to 1, which adds the current record to the result set.
The SCORE code observes the active WHERE clause in the IMSTAT run block. If a WHERE clause is active, the scoring code is executed only for the observations that meet the WHERE condition.
The following example loads the SASHELP.CARS data set partitioned by the TYPE variable, and executes the previous code sample.
data lasrlib.cars(partition=(type));
    set sashelp.cars;
run;

filename fref '/path/to/scorepgm.sas';

proc imstat;
   table lasrlib.cars;
   score pgm=fref / partition;
run;
The PARTITION option in the SCORE statement requests the server to execute the code separately for each partition of the data. Because the code outputs one observation per partition and there are six unique values of the TYPE variable in the SASHELP.CARS data set, the scoring results show six rows:
                Scoring Results for Table WORK.CARS
 
 totalmsrp     minmsrp      numCars  orgdrive          mpgdiff  Type

     59760       19110     3.000000  Asia  Front     -8.000000  Hybrid
   2087415       17163    60.000000  EuropeAll        5.000000  SUV   
   7800688       10280   262.000000  EuropeFront      7.000000  Sedan 
   2615966       18345    49.000000  Asia  Rear       6.000000  Sports
    598593       12800    24.000000  Asia  All        3.000000  Truck 
    865216       11905    30.000000  EuropeAll        7.000000  Wagon 
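
As a variation on this example, an active WHERE clause restricts the observations that the scoring code sees. The following sketch reuses the same table and fileref. The filter on ORIGIN is a hypothetical condition chosen for illustration.

```sas
proc imstat;
   table lasrlib.cars;
   where origin = 'Asia';          /* hypothetical filter: score only these observations */
   score pgm=fref / partition;     /* the partition summaries now cover only the filtered rows */
run;
```
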
The SCORE statement does not support the temporary expressions that are available in other IMSTAT statements. This is because you can compute all necessary temporary variables in the scoring code.
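
As a sketch of that approach, the following hypothetical scoring program computes an intermediate value and uses it with the __lasr_output symbol to select rows. The threshold of 8 is an arbitrary value for illustration, and the program is assumed to be saved in the file that the CODE= file reference points to.

```sas
/* Hypothetical scoring program: output only the rows with a large */
/* gap between highway and city mileage.                           */
__lasr_output = 0;                       /* default: do not output the row */
mpgdiff = mpg_highway - mpg_city;        /* temporary value computed in the scoring code */
if mpgdiff > 8 then __lasr_output = 1;   /* output the rows that exceed the threshold */
```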