SASEPFUNC Table Function

Overview of the SASEPFUNC Table Function

The SASEPFUNC table function is the interface for running the scoring model inside Oracle with the SAS Embedded Process. The SASEPFUNC table function performs the scoring based on the parameters that are passed to it. It uses the .ds2 and .ufmt XML files that are stored in the model table.
This function is created by the SASADMIN user when the in-database deployment package is installed and configured. For more information, see the SAS In-Database Products: Administrator's Guide.

Using the SASEPFUNC Table Function

You can use the SASEPFUNC table function using explicit pass-through and PROC SQL or you can use other Oracle query tools such as SQLPlus. Use the SASEPFUNC function in the FROM clause in any SQL expression to run the scoring model.
Tip
Look at the SampleSQL.txt file is produced when the %INDOR_PUBLISH_MODEL macro is successfully run. This file can be found in the output directory (OUTDIR argument) that you specify in the %INDOR_PUBLISH_MODEL macro. The SampleSQL.txt file contains basic SQL code that can be used to run your score code inside Oracle. Please note that you must modify the sample code before using it. Otherwise, the sample code returns an error.
Tip
The SampleSQL.txt file refers to an ID column in the example table that is populated with a unique integer from 1 to n. n is the number of rows in the table. The ID column uniquely identifies each row. You would replace the ID column with your own primary key column.
Note: Before using the SASEPFUNC table function with the SAS Embedded Process, you must create the model table with the %INDOR_CREATE_MODELTABLE macro. Then, you must publish the files to the model table with the %INDOR_PUBLISH_MODEL macro. For more information, see Creating a Model Table and Running the %INDOR_PUBLISH_MODEL Macro.
Here is an example using PROC SQL.
proc sql;
connect to oracle (user=userid password=xxxx path=mydatabase);
create table work.sas_score_out1 as select * from connection to oracle
   (SELECT * from table(SASEPFUNC(CURSOR(select * from intrtree), 
      'myschema.SAS_PUBLISH_MODEL', 'INTRREG1', 'null',
      'select * from intrtree'));
disconnect from oracle;quit;

SASEPFUNC Table Function Syntax

The syntax of the SASEPFUNC table function is as follows.
FROM TABLE (SASEPFUNC(
CURSOR (SELECT /* + PARALLEL(table-alias, dop) */
* FROM input-table table-alias),
'schema-name.model-table-name',
'model-name', 'null',
'SELECT * FROM 'input-table' ))
Arguments

CURSOR(SELECT /*PARALLEL (table-alias, dop) */ * FROM input-table table-alias)

specifies the SELECT statement to read from the input table.

Tip You can specify a hint for the degree of parallelism (dop) value that is used for reading the input table. For more information, see Run-Time Guidance for the Oracle Degree of Parallelism (DOP) Setting.

table-alias

specifies an alias for the input table name.

Requirement The table alias must be the same in the parallel hint and the FROM clause.

input-table

specifies the name of the input table that is used by the SASEPFUNC table function.

schema-name.model-table-name

specifies the fully qualified model table name.

Requirement The table name that you specify for this function must be the same table name that is used in the %INDOR_CREATE_MODELTABLE macro. For more information, see Creating a Model Table.

model-name

specifies the model name.

Requirement The model name must be the same name that is used in the %INDOR_PUBLISH_MODEL macro. For more information, see %INDOR_PUBLISH_MODEL Macro Syntax.

null

specifies a placeholder value at this time

Tip You can specify either 'null' or an empty string ' '.

SELECT * FROM input-table

specifies a simple SELECT statement for the input table.

Requirement The input table name must be the same in the first SELECT statement.

Run-Time Guidance for the Oracle Degree of Parallelism (DOP) Setting

The performance of the Scoring Accelerator for Oracle can be affected by altering the Degree of Parallelism (DOP) setting. The DOP setting defines how many parallel processes work on a single statement. In a Real Application Clusters (RAC) environment, the parallel processes are distributed among the available database instances when the chosen DOP exceeds the expected capabilities of a single node. In environments with mixed workloads and multi-concurrency, you should rely on the parallelism provided by the Oracle database. However, you might want to consider adjusting the DOP setting to achieve maximum throughput for dedicated Scoring Accelerator operations.
Because Oracle and SAS use separate threads during execution, improvements throughput diminish for DOP values that are greater than half the total number of cores available. For example, if you have 128 cores total available for all instances, a DOP greater than 64 is not likely to yield improved performance. Performance is not improved because both the Oracle and SAS processes tend to be CPU intensive. Setting the DOP up to this maximum level assumes that the system is solely dedicated to running the SAS Scoring Accelerator. For a mixed load system, a lower DOP value might be more appropriate.
In RAC environments, Oracle allocates parallel execution servers based on an internal load-balancing algorithm. This allocation ensures approximately average loads across all nodes that are accessible for a given parallel operation. Because the load of the SAS processes is not compensated for in Oracles internal algorithms, it can be beneficial in some environments to change Oracles default behavior. There are two ways of doing so:
  • Disable Oracles internal load balancing: this can be accomplished by setting the internal parameter _parallel_load_balancing to FALSE (the default value of this parameter is TRUE). Oracle then does a plain round-robin allocation of processes across all available nodes. This parameter can be changed on a system and session level.
  • Adjust the number of parallel execution servers per load balance unit. The load balance unit is chosen internally by Oracle to ensure a maximum co-location of parallel execution servers on a single node. This unit is dependent on the number of available CPUs in a system. You can decrease the unit by setting the internal Oracle parameter _parallel_load_bal_unit. The default value of this parameter is 0, meaning the system internally calculates this value. Similar to DOP, setting the _parallel_load_bal_unit parameter beyond half the core total per instance is not likely to be beneficial.
CAUTION:
Influencing Oracles internal load balancing for parallel execution does not harm a system in any way.
However, influencing (changing) Oracles internal algorithms even in a single session has an impact on the overall system through the different allocation of parallel processing resources across the cluster. You have to test your adjustments for possible run time and performance impacts of the overall system.