Using Scoring Functions to Run Scoring Models

How to Run a Scoring Model Using Scoring Functions

The %INDNZ_PUBLISH_MODEL macro creates the files that are needed to build the scoring functions and publishes those files to a specified database in the Netezza data warehouse. Only the EM_ output variables are published as Netezza scoring functions. For more information about the EM_ output variables, see Fixed Variable Names.
To run the scoring model using scoring functions, follow these steps:
  1. Run the %INDNZ_PUBLISH_MODEL macro.
    The %INDNZ_PUBLISH_MODEL macro uses some of the files that are created by the SAS Enterprise Miner Score Code Export node: the scoring model program (score.sas file), the properties file (score.xml file), and (if the training data includes SAS user-defined formats) a format catalog.
    The %INDNZ_PUBLISH_MODEL macro performs the following tasks:
    • takes the score.sas and score.xml files and produces a set of .c, .cpp, and .h files. These .c, .cpp, and .h files are necessary to build separate scoring functions for each of a fixed set of quantities that can be computed by the scoring model code.
    • processes the format catalog and creates an .h file with C structures if a format catalog is available. This file is also necessary to build the scoring functions.
    • produces a script of the Netezza commands that are necessary to register the scoring functions on the Netezza data warehouse.
    • transfers the .c, .cpp, and .h files to Netezza using the Netezza External Table interface.
    • calls the SAS_COMPILEUDF function to compile the source files into object files and to access the SAS formats library.
    • uses the SAS/ACCESS Interface to Netezza to run the script to create the scoring functions with the object files.
    For more information, see Running the %INDNZ_PUBLISH_MODEL Macro. For more information about the scoring functions, see Scoring Function Names and Viewing the Scoring Functions.
  2. Use the scoring functions in any SQL query.

Scoring Function Names

The names of the scoring functions that are built in Netezza have the following format:
modelname_EM_outputvarname
modelname is the name that was specified in the MODELNAME argument of the %INDNZ_PUBLISH_MODEL macro. modelname is always followed by _EM_ in the scoring function name. For more information about the MODELNAME argument, see Running the %INDNZ_PUBLISH_MODEL Macro.
outputvarname is derived from the names of the EM_ output variables in the score.xml file that is generated from the SAS Enterprise Miner Score Code Export node. For more information about the score.xml file, see Fixed Variable Names.
One scoring function is created for each EM_ output variable in the score.xml file. For example, if the scoring model DATA step program takes ten inputs and creates three new variables, then three scoring functions are defined. Each scoring function has the name of an output variable. For example, if you set MODELNAME=credit in the %INDNZ_PUBLISH_MODEL macro, and the EM_ output variables are “EM_PREDICTION”, “EM_PROBABILITY”, and “EM_DECISION”, then the name of the scoring functions that are created would be “credit_EM_PREDICTION”, “credit_EM_PROBABILITY”, and “credit_EM_DECISION”.
Note: The scoring function name cannot exceed 128 characters.
CAUTION:
When the scoring function is generated, the names are case insensitive.
Consequently, if you have model names “Model01” and “model01”, and you create two scoring functions, the second scoring function overwrites the first scoring function.

Viewing the Scoring Functions

There are four ways to see the scoring functions that are created:
  • From Netezza, log on to the database using a client tool such as NZSQL and submit an SQL statement. The following example assumes that the model name that you used to create the scoring functions is mymodel.
    nzsql database username password
      
     select function,createdate,functionsignature from _v_function where 
       function like '%MYMODEL%'
  • From SAS, use SQL procedure code that produces output in the LST file. The following example assumes that the model name that you used to create the scoring functions is mymodel.
    proc sql noerrorstop;
      connect to netezza (server=servername database=database 
        username=username password=password);
      select *
         from connection to netezza
         (select function,createdate,functionsignature
          from _v_function where
           function like '%MYMODEL%');
       disconnect from netezza;
    quit;
    
  • You can look at the SAS log that is created when the %INDNZ_PUBLISH_MODEL macro was run. A message is printed to the SAS log that indicates whether a scoring function is successfully or not successfully created or replaced.
  • Look at the SampleSQL.txt file that is produced when the %INDNZ_PUBLISH_MODEL macro is successfully run. This file can be found in the output directory (OUTDIR argument) that you specify in the macro.
    The SampleSQL.txt file contains basic SQL code that can be used to run your score code inside Netezza. Please note that you must modify the sample code before using it. Otherwise, the sample code returns an error.
    For example, the SampleSQL.txt file refers to an ID column in allmush1_intab that is populated with a unique integer from 1 to n. n is the number of rows in the table.
    Note: The ID column uniquely identifies each row. You would replace the ID column with your own primary key column.
    The following example assumes that the model name that you used to create the scoring functions is allmush1.
    drop table allmush1_outtab;
    create table allmush1_outtab(
     id integer
    ,"EM_CLASSIFICATION" varchar(33)
    ,"EM_EVENTPROBABILITY" float
    ,"EM_PROBABILITY" float
    );
    insert into allmush1_outtab(
     id
    ,"EM_CLASSIFICATION"
    ,"EM_EVENTPROBABILITY"
    ,"EM_PROBABILITY"
    )
    select id,
     allmush1_em_classification("BRUISES"
    ,"CAPCOLOR"
    ,"GILLCOLO"
    ,"GILLSIZE"
    ,"HABITAT"
    ,"ODOR"
    ,"POPULAT"
    ,"RINGNUMB"
    ,"RINGTYPE"
    ,"SPOREPC"
    ,"STALKCBR"
    ,"STALKROO"
    ,"STALKSAR"
    ,"STALKSHA"
    ,"VEILCOLO")
      as "EM_CLASSIFICATION",
     allmush1_em_eventprobability("BRUISES"
    ,"CAPCOLOR"
    ,"GILLCOLO"
    ,"GILLSIZE"
    ,"HABITAT"
    ,"ODOR"
    ,"POPULAT"
    ,"RINGNUMB"
    ,"RINGTYPE"
    ,"SPOREPC"
    ,"STALKCBR"
    ,"STALKROO"
    ,"STALKSAR"
    ,"STALKSHA"
    ,"VEILCOLO")
      as "EM_EVENTPROBABILITY",
     allmush1_em_probability("BRUISES"
    ,"CAPCOLOR"
    ,"GILLCOLO"
    ,"GILLSIZE"
    ,"HABITAT"
    ,"ODOR"
    ,"POPULAT"
    ,"RINGNUMB"
    ,"RINGTYPE"
    ,"SPOREPC"
    ,"STALKCBR"
    ,"STALKROO"
    ,"STALKSAR"
    ,"STALKSHA"
    ,"VEILCOLO")
      as "EM_PROBABILITY"
    from allmush1_intab ;
    

Using Scoring Functions to Run a Scoring Model

The scoring functions are available to use in any SQL expression in the same way that Netezza built-in functions are used.
After the scoring functions are created, they can be invoked in Netezza using SQL, as illustrated in the following example. Each output value is created as a separate function call in the select list. The SampleSQL.txt file shown in Viewing the Scoring Functions was modified to create the SELECT statement in this example.
select id, allmush1_em_classification
(
"BRUISES"
,"CAPCOLOR"
,"GILLCOLO"
,"GILLSIZE"
,"HABITAT"
,"ODOR"
,"POPULAT"
,"RINGNUMB"
,"RINGTYPE"
,"SPOREPC")
  as "EM_CLASSIFICATION",
 from allmush1_intab ;
Note: The function and table names must be fully qualified if the function and table are not in the same database.