Scoring Files and Functions inside the Aster nCluster Database

Aster nCluster Scoring Files

The %INDAC_PUBLISH_MODEL macro produces three scoring files for each model:
  • sasscore_modelname.ds2. This file contains code that is executed by the SAS_SCORE() function.
  • sasscore_modelname_io.xml. This file contains the scoring model's input and output variables.
  • sasscore_modelname_ufmt.xml. This file contains user-defined formats for the scoring model that is being published.
For Aster nCluster 4.5, these files are stored in the NC_INSTALLED_FILES table under the PUBLIC schema. For Aster nCluster 4.6, these files are stored in the NC_USER_INSTALLED_FILES table under the schema that you specified in the INDCONN macro variable.See Scoring File Examples for an example of each of these files.
Note: When you publish a model using Aster 4.5, you are likely to receive warnings about multiple lengths and unbalanced quotation marks. This warning does not keep the model from being published successfully. The error occurs because the .ds2 scoring file is inserted into an Aster system table as a long quoted string.
There are four ways to see the scoring files that are created:
  • Log on to the database using the Aster nCluster command line processor and submit an SQL statement. The following example assumes that the model name that you used to create the scoring files is reg.
    >act -h hostname -u username -w password -d databasename
    >select name from nc_user_installed_files where name like '%sasscore_reg%';
    Three files are listed for each model:
        name
    -------------------------
     sasscore_reg.ds2
     sasscore_reg_io.xml
     sasscore_reg_ufmt.xml
  • From SAS, use SQL procedure code that produces output in the LST file. The following example assumes that the model name that you used to create the scoring functions is reg.
    proc sql noerrorstop;
      connect to aster (user=username password=password dsn=dsnname);
      
    select *
       from connection to aster
          (select filename, fileowner, uploadtime
             from nc_user_installed_files where
                name like 'sasscore_reg%');
       disconnect from aster;
    quit;
    
  • Look at the SampleSQL.txt file that is produced when the %INDAC_PUBLISH_MODEL macro is successfully run. This file can be found in the output directory (OUTDIR argument) that you specify in the %INDAC_PUBLISH_MODEL macro.
    The SampleSQL.txt file contains basic code that, with modifications, can be used to run your score code inside Aster nCluster.
    Note: The function and table names must be fully qualified if the functions and tables are not in the same database.
    For example, the SampleSQL.txt file refers to an ID column in allmush1_intab that is populated with a unique integer from 1 to n, with n being the number of rows in the table. The ID column uniquely identifies each row. You would replace the ID column with your own primary key column.
    The following example assumes that the model name that you used is reg.
    drop table score_outtab;
    create table score_outtab(
     id integer
    ,"EM_CLASSIFICATION" varchar(256)
    ,"EM_EVENTPROBABILITY" float
    ,"EM_PROBABILITY" float
    );
    insert into score_outtab(
     id
    ,"EM_CLASSIFICATION"
    ,"EM_EVENTPROBABILITY"
    ,"EM_PROBABILITY"
    )
    select id,
    "EM_CLASSIFICATION",
    "EM_EVENTPROBABILITY",
    "EM_PROBABILITY"
    from sas_score(on score_intab model('reg'));
    
  • Look at the SAS log that is created when the %INDAC_PUBLISH_MODEL macro was run. A message that indicates whether the scoring files are successfully or not successfully created is printed to the SAS log.

SAS_SCORE() Function

Overview of the SAS_SCORE() Function

The SAS_SCORE() function is an SQL/MR function that executes the scoring model running on the SAS Embedded Process in Aster nCluster. The SAS_SCORE() function is deployed and stored in the PUBLIC schema during the installation and configuration of the in-database deployment for Aster nCluster.
For more information about installing and configuring the in-database deployment package for Aster nCluster, see the SAS In-Database Products: Administrator's Guide.

Scoring Files Table

The NC_INSTALLED_FILES table contains the following columns. The ModelName column is the table key. The table is referenced by the two-level name model-name.model-table-name.
Column Name
Description
Specification
ModelName
contains the name of the model
VARCHAR(128) CHARACTER SET UNICODE CASESPECIFIC
ModelDS2
contains the sasscore_modelname.ds2 file
BLOB(209708800)
ModelFormats
contains the sasscore_modelname_ufmt.xml file
BLOB(209708800)
ModelOwner
contains the name of the user who published the model
VARCHAR(128) CHARACTER SET UNICODE CASESPECIFIC
ModelUpdated
contains the date and time that the model was published
TIMESTAMP(6)

Using the SAS_SCORE() Function

You can use the SAS_SCORE() function in the FROM clause in any SQL expression in the same way that Aster nCluster SQL/MR functions are used.
The syntax of the SAS_SCORE() function is as follows:
FROM SAS_SCORE(ON input-table MODEL('model-name')
<MODEL_SCHEMA('schema-name')>)
Arguments
input-table
specifies the input table that is used by the SAS_SCORE() function.
model-name
specifies the name of the model. The value of this argument is the same as the value of MODELNAME=name argument for the %INDAC_PUBLISH_MODEL macro.
schema-name
specifies the name of the schema where the scoring model files are published.
Restriction: This argument is valid only for Aster nCluster 4.6. For Aster nCluster 4.5, the scoring model files are published to the PUBLIC schema.
Default: your default schema. To determine your default schema name, use the show search_path command from the Aster Client Tool (ACT).
Here is an example of using the SAS_SCORE function. In this example, the input table is score_intab and the model name is reg.
select id, em_classification, em_eventprobability, em_probability 
   from sas_score (on score_intab model('reg') model_schema('mysch'));