Scoring Functions inside the DB2 Database

Scoring Function Names

The names of the scoring functions that are built in DB2 have the following format:
modelname_EM_outputvarname
modelname is the name that was specified in the MODELNAME argument of the %INDB2_PUBLISH_MODEL macro. modelname is always followed by _EM_ in the scoring function name. For more information about the MODELNAME argument, see %INDB2_PUBLISH_MODEL Macro Syntax.
outputvarname is derived from the names of the EM_ output variables in the score.xml file that is generated from the SAS Enterprise Miner Score Code Export node. For more information about the score.xml file, see Fixed Variable Names.
One scoring function is created for each EM_ output variable in the score.xml file. For example, if the scoring model DATA step program takes ten inputs and creates three new variables, then three scoring functions are defined, each with the name of an output variable. For example, if you set MODELNAME=credit in the %INDB2_PUBLISH_MODEL macro and the EM_ output variables are “EM_PREDICTION”, “EM_PROBABILITY”, and “EM_DECISION”, then the name of the scoring functions that are created would be “credit_EM_PREDICTION”, “credit_EM_PROBABILITY”, and “credit_EM_DECISION”.
Note: A scoring function name cannot exceed 128 characters.
CAUTION:
When the scoring function is generated, the names are case insensitive.
Consequently, if you have model names “Model01” and “model01”, and you create two scoring functions, the second scoring function overwrites the first scoring function.

Using the Scoring Functions

The scoring functions are available to use in any SQL expression in the same way that DB2 built-in functions are used. For an example, see Model Publishing Macro Example.
There are four ways to see the scoring functions that are created:
  • From DB2, log on to the database using the DB2 client tool (command line processor) and submit an SQL statement. The following example assumes that the model name that you used to create the scoring functions is mymodel and the DB2 installation instance is located in /users/db2v9. The first line of code executes a db2profile script. The script sets the DB2 environment variables so the DB2 command line processor (CLP) can execute.
    >./users/db2v9/sqllib/db2profile
    >db2
    db2 => connect to database user username using password
    db2 => select * from syscat.functions where funcname like '%MYMODEL%'
  • From SAS, use SQL procedure code that produces output in the LST file. The following example assumes that the model name that you used to create the scoring functions is mymodel.
    proc sql noerrorstop;
      connect to db2 (user=username pw=password db=database);
      
    select *
       from connection to db2
          (select * from syscat.functions where funcname like '%MYMODEL%');
       disconnect from db2;
    quit;
    
    You can also use the SASTRACE and SASTRACELOC system options to generate tracing information. For more information about these system options, see the SAS System Options: Reference.
  • Look at the SampleSQL.txt file that is produced when the %INDB2_PUBLISH_MODEL macro is successfully run. This file can be found in the output directory (OUTDIR argument) that you specify in the macro.
    The SampleSQL.txt file contains basic code that, with modifications, can be used to run your score code inside DB2.
    For example, the SampleSQL.txt file refers to an ID column in allmush1_intab that is populated with a unique integer from 1 to n, with n being the number of rows in the table. The ID column uniquely identifies each row. You would replace the ID column with your own primary key column.
    Note: The function and table names must be fully qualified if the function and table are not in the same schema.
    The following example assumes that the model name that you used to create the scoring functions is allmush1.
    drop table allmush1_outtab;
    create table allmush1_outtab(
     id integer
    ,"EM_CLASSIFICATION" varchar(33)
    ,"EM_EVENTPROBABILITY" float
    ,"EM_PROBABILITY" float
    );
    insert into allmush1_outtab(
     id
    ,"EM_CLASSIFICATION"
    ,"EM_EVENTPROBABILITY"
    ,"EM_PROBABILITY"
    )
    select id,
     allmush1_em_classification("BRUISES"
    ,"CAPCOLOR"
    ,"GILLCOLO"
    ,"GILLSIZE"
    ,"HABITAT"
    ,"ODOR"
    ,"POPULAT"
    ,"RINGNUMB"
    ,"RINGTYPE"
    ,"SPOREPC"
    ,"STALKCBR"
    ,"STALKROO"
    ,"STALKSAR"
    ,"STALKSHA"
    ,"VEILCOLO")
      as "EM_CLASSIFICATION",
     allmush1_em_eventprobability("BRUISES"
    ,"CAPCOLOR"
    ,"GILLCOLO"
    ,"GILLSIZE"
    ,"HABITAT"
    ,"ODOR"
    ,"POPULAT"
    ,"RINGNUMB"
    ,"RINGTYPE"
    ,"SPOREPC"
    ,"STALKCBR"
    ,"STALKROO"
    ,"STALKSAR"
    ,"STALKSHA"
    ,"VEILCOLO")
      as "EM_EVENTPROBABILITY",
     allmush1_em_probability("BRUISES"
    ,"CAPCOLOR"
    ,"GILLCOLO"
    ,"GILLSIZE"
    ,"HABITAT"
    ,"ODOR"
    ,"POPULAT"
    ,"RINGNUMB"
    ,"RINGTYPE"
    ,"SPOREPC"
    ,"STALKCBR"
    ,"STALKROO"
    ,"STALKSAR"
    ,"STALKSHA"
    ,"VEILCOLO")
      as "EM_PROBABILITY"
    from allmush1_intab ;
    
  • You can look at the SAS log. A message that indicates whether a scoring function is successfully or not successfully executed is printed to the SAS log.