Publish Models to a Database

Overview

SAS Model Manager enables you to publish the project champion model and challenger models that are associated with the DATA Step score code type to a configured database. Publishing PMML models to a database is not supported. SAS Model Manager uses the SAS Scoring Accelerator and SAS/ACCESS interface to the database to publish models to the database. The Scoring Accelerator takes the models from SAS Model Manager and translates them into scoring files or functions that can be deployed inside the database. After the scoring functions are published using the SAS/ACCESS interface to the database, the functions extend the database’s SQL language and can be used in SQL statements such as other database functions. After the scoring files are published, they are used by the SAS Embedded Process to run the scoring model.
The SAS administrator can enable the Publish Scoring Options setting in SAS Management Console to indicate that the metadata tables be populated in the target database when publishing a scoring function. If this setting is enabled and the scoring function publish method is chosen, the scoring metadata tables in the database are populated with information about the project and pointers to the scoring function. This feature enables users to review descriptions and definitions of the published model. The audit logs track the history of the model's usage and any changes that are made to the scoring project.
For more information, see Publishing Models to a Database in SAS Model Manager: User's Guide.
This tutorial shows you the tasks that are involved in publishing a project champion model or challenger model to a database. It contains examples and step-by-step directions about preparing a database for use with SAS Model Manager and publishing a model.
Note: The examples that are used in this tutorial use the Teradata database type for publishing a model to a database. You can also use this tutorial to publish a model to a DB2, Greenplum, Oracle, or Netezza database.

Publish a Model Using the SAS Embedded Process Publish Method

In this exercise, you publish a project's champion model to a database using the SAS Embedded Process publish method.
To publish a model to a database, follow these steps:
  1. Verify that you have set the project champion model. For more information, see Set the Champion Model.
  2. Right-click the Loan project and select Publish Modelsthen selectto a Database from the pop-up menu. The Publish Models to a Database window appears.
    Publish Models to a Database Window
  3. Select a database type and select the SAS Embedded Process for the publish method. The type of database and the publish method that you choose determine which database settings and options are required. The default publish method is SAS Embedded Process.
    Operating Environment Information: The Netezza database type for the SAS Embedded Process publish method is not supported in the second maintenance release of SAS 9.3. When the SAS Embedded process publish method is supported for Netezza, the SAS Administrator can enable Netezza support for SAS Model Manager so that the Netezza database type appears. For more information, see the SAS Model Manager: Administrator's Guide.
  4. Select the check box next to the Tree 1 champion model in the list.
  5. Enter a publish name for the champion model that you selected to publish or accept the default value. The SAS Embedded Process publish method uses only the publish name to publish the model files to the database.
    Here are the naming convention requirements:
    • The user-defined value is case insensitive. The maximum length of alphanumeric characters is determined by the database type and publish method that is selected. No spaces are allowed. An underscore is the only special character that can be included in the publish name.
    • The recommended maximum length of the publish name for the SAS Embedded Process publish method is 30 alphanumeric characters for all database types. The database types that are currently supported by SAS Model Manager are Teradata, Oracle, Greenplum, and DB2.
    Note: The publish name for each model is reserved by default for subsequent use of the publishing models for a project.
  6. Enter a value for the database settings that appear for the selected database type and publish method.
    Here are the available database settings according to the database type:
    Database Settings
    Database Type
    Database server
    • Teradata
    • Oracle
    • Netezza
    • Greenplum
    • DB2
    Database
    • Teradata
    • Oracle
    • Netezza
    • Greenplum
    • DB2
    User ID
    • Teradata
    • Oracle
    • Netezza
    • Greenplum
    • DB2
    Password
    • Teradata
    • Oracle
    • Netezza
    • Greenplum
    • DB2
    Schema
    • Oracle
    • Greenplum
    • DB2
    Publish Models to Database fields completed
  7. Click More Options. The <Database-type> Options window appears.
    Database Options window
    Select the check box for the desired validation options that appear for the selected database type:
    • Validate scoring results
    • Keep scoring files if validation fails
    • Display detailed log messages
    • Use model input
    Note: By default, the Validate scoring results and Use model input options are selected.
  8. Enter a numeric value for Sample Size. The default value for sample size is 100 if the value is null or zero. The maximum number of digits that are allowed is 8.
    Database Options window
  9. Click OK. A message is displayed to indicate whether the models were published to the database successfully or not.
    Publish Name Verification Message
    Note: The value of the publish name is validated against the target database, when the option Replace scoring files that have the same publish name is not selected for the SAS Embedded Process publish method. If the publish name is not unique, an error message is displayed.
  10. Click Close to complete the publishing process. The SAS code, scoring results log, and output for the SAS Embedded Process are placed in the Publish Resultsthen select<publish-model-name> folder in the project file list, and the ModelNameForEP user-defined project property is populated.
    User-Defined Project Properties
  11. The actions that are performed during the publishing process are displayed in the history. To view the history of the project, select the project name and then click the History tab in the Annotations - Loan pane.
  12. To view the publish history, select the Loan project, and select the Publish History tab. Select an item from the list to view the publish details
Note: After you have completed the publishing process, you can view the log file. The Publish Results folder in the Project Tree contains a folder for each model that was published. The publish name is used to create a folder for each model that is published. That folder contains the ScoringResults.log file. The time that the process started, details about who initiated the process, and the time when the project was published are recorded. Error messages are also recorded in the log file. The log file provides an audit trail of all relevant actions in the publishing process.
Publish Results for model

Publish a Model Using the Scoring Function Publish Method

In this exercise, you publish a project’s champion model to a database using the scoring function publish method.
To publish a model, follow these steps:
  1. Verify that you have set the project champion model in Tutorial 3. For more information, see Set the Champion Model.
  2. (Optional) Select the Loan project folder and enter a value for the DbmsTable user-defined property. This value is the scoring input table that the DBA might have created in the database to be used with a scoring application.
    Note: If you plan to use scoring application or SQL code to score this project, you must first set the DbmsTable property to the name of input table in your database that you want to use for scoring the champion or challenger model. When you publish a scoring function, the information that is associated with the input table in the database is updated to contain the value of the DbmsTable property. The scoring application or SQL code can then query the database for the input table name to use as the scoring input table.
    DbmsTable Project Property
  3. Right-click the Loan project in the Project Tree and select Publish Modelsthen selectto a Database. The Publish Models to a Database window appears.
    Publish Models to Database window
  4. Select a database type and select the scoring function for the publish method. The type of database and the publish method that you choose determine which database settings and options are required.
    Publish Models to Database window for scoring function method
  5. Select the check box next to the Reg 1 challenger model in the list.
    Tip
    If you have not published the champion model yet, select the champion model Tree 1 as well.
  6. Enter a publish name for each model that you selected to publish. The scoring function publish method has a system-generated prefix and a publish name. These are used to publish the model scoring function. The publish name is a user-defined value that can be modified.
    Here are the naming convention requirements:
    • The user-defined value is case insensitive. The maximum length of alphanumeric characters is determined by the database type and publish method that is selected. No spaces are allowed. An underscore is the only special character that can be included in the publish name.
    • The recommended maximum lengths of the publish names for the scoring function publish method are the following:
      • 19 alphanumeric characters for Teradata
      • 30 alphanumeric characters for Netezza, Greenplum, and DB2
      UNIX Specifics: The publish name (user-defined) portion of the function name in an AIX environment has a maximum length of 16 alphanumeric characters for Teradata.
    Note: The publish name for each model is reserved by default for subsequent use of the publishing models for a project.
  7. Enter a value for the database settings that appear for the selected database type and publish method.
    Here are the available database settings according to the database type:
    Database Settings
    Database Type
    Database server
    • Teradata
    • Netezza
    • Greenplum
    • DB2
    Database
    • Teradata
    • Netezza
    • Greenplum
    • DB2
    User ID
    • Teradata
    • Netezza
    • Greenplum
    • DB2
    Password
    • Teradata
    • Netezza
    • Greenplum
    • DB2
    Server user ID
    DB2
    Compile database
    Netezza
    Jazlib database
    Netezza
    Schema
    • Greenplum
    • DB2
    Initial wait time (in seconds)
    DB2
    FTP time out (in seconds)
    DB2
    Publish Models to Database window with populated content
  8. Click More Options. The <Database-type> Options window appears.
    Database Options window
    Select the check box for the desired validation options that appear for the selected database type:
    • Validate scoring results
    • Keep scoring function if validation fails
    • Display detailed log messages
    • Use model input
    • Protected mode (Teradata scoring function option) or Fenced mode (DB2 and Netezza scoring function option)
    Note: By default, the Validate scoring results and Use model input options are selected for both publish methods. The Protected mode or the Fenced mode options are selected by default for the scoring function publish method.
  9. Enter a numeric value for Sample Size. The default value for sample size is 100 if the value is null or zero. The maximum number of digits that are allowed is 8.
    Database Options window
  10. Click OK. A message is displayed to indicate whether the models were published to the database successfully or not.
    Publish Name Verification Message
    Note: The publish name portion of the Function Name is validated against the target database. The scoring function with the same publish name is replaced automatically.
  11. Click Close to complete the publishing process.
    The SAS score code (for example, Y120924043_Reg_1_Loan.sas), the scoring results log, and output for the scoring function are placed in the Publish Resultsthen select<prefix_publish-model-name_project-name> folder in the project file list, and the ScoringFunctionName and ScoringFunctionPrefix user-defined project properties are populated.
    User-Defined Project Properties
  12. The actions that are performed during the publishing process are displayed in the history. To view the history of the project, select the project name and then click the History tab in the Annotations - Loan pane.
  13. To view the publish history, select the Loan project, and select the Publish History tab. Select an item from the list to view the publish details
Note: After you have completed the publishing process, you can view the log file. The Publish Results folder in the Project Tree contains a folder for each model that was published. The publish name is used to create a folder for each model that is published. That folder contains the ScoringResults.log file. The time that the process started, details about who initiated the process, and the time when the project was published are recorded. Error messages are also recorded in the log file. The log file provides an audit trail of all relevant actions in the publishing process.
Publish Results for model