Running the %INDGP_PUBLISH_MODEL Macro

%INDGP_PUBLISH_MODEL Macro Run Process

To run the %INDGP_PUBLISH_MODEL macro, complete the following steps:
  1. Create a scoring model using SAS Enterprise Miner.
  2. Use the SAS Enterprise Miner Score Code Export node to create a score output directory. Populate the directory with the score.sas file, the score.xml file, and (if needed) the format catalog.
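  3. Start SAS and submit one of the following %LET statements in the Program Editor or Enhanced Editor. Use the first form to connect through a DSN and the second form to connect by server and database name: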
    %let indconn = user=youruserid password=yourpwd
        dsn=yourdsn schema=yourschema;
    %let indconn = user=youruserid password=yourpwd server=yourserver
        database=yourdb schema=yourschema;
    
    For more information, see the INDCONN Macro Variable.
  4. Run the %INDGP_PUBLISH_MODEL macro. For more information, see %INDGP_PUBLISH_MODEL Macro Syntax.
    Messages that indicate whether the scoring files or functions were created successfully are written to the SAS log.
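
The steps above can be sketched as a short SAS session. This is a minimal sketch rather than a definitive job: the user ID, password, DSN, schema, the directory /sasmodels/churn1, and the model name churn1 are hypothetical placeholders, and the sketch assumes that the Score Code Export node wrote its default score.sas and score.xml files to that directory.

```sas
/* Step 3: assign the connection credentials (placeholder values). */
%let indconn = user=youruserid password=yourpwd
               dsn=yourdsn schema=yourschema;

/* Step 4: publish the model. DIR= is the fully qualified path of   */
/* the Score Code Export output directory (hypothetical here).      */
/* MODELNAME= must be a valid SAS name of 10 characters or fewer.   */
%indgp_publish_model(dir=/sasmodels/churn1,
                     modelname=churn1);
```

Because MECHANISM= and ACTION= are omitted, the defaults (STATIC and CREATE) apply. Check the SAS log after the macro finishes to confirm that the scoring functions were created.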

INDCONN Macro Variable

The INDCONN macro variable provides the credentials that are used to connect to Greenplum. You must specify the user, the password, and either the DSN name or both the server and database names. The schema name is optional. You must assign the INDCONN macro variable before the %INDGP_PUBLISH_MODEL macro is invoked.
The value of the INDCONN macro variable for the %INDGP_PUBLISH_MODEL macro has one of these formats:
USER=<'>username<'> PASSWORD=<'>password<'> DSN=<'>dsnname<'>
<SCHEMA=<'>schemaname<'>> <PORT=<'>port-number<'>>
USER=<'>username<'> PASSWORD=<'>password<'> SERVER=<'>servername<'>
DATABASE=<'>databasename<'> <SCHEMA=<'>schemaname<'>>
<PORT=<'>port-number<'>>
Arguments

USER=<'>username<'>

specifies the Greenplum user name (also called the user ID) that is used to connect to the database. If the user name contains spaces or nonalphanumeric characters, you must enclose the user name in quotation marks.

PASSWORD=<'>password<'>

specifies the password that is associated with your Greenplum user ID. If the password contains spaces or nonalphabetic characters, you must enclose the password in quotation marks.

Tip Use only PASSWORD=, PASS=, or PW= for the password argument. PWD= is not supported and causes an error.

DSN=<'>datasourcename<'>

specifies the configured Greenplum ODBC data source to which you want to connect. If the DSN contains spaces or nonalphabetic characters, you must enclose the DSN in quotation marks.

Requirement You must specify either the DSN= argument or the SERVER= and DATABASE= arguments in the INDCONN macro variable.

SERVER=<'>servername<'>

specifies the Greenplum server name or the IP address of the server host. If the server name contains spaces or nonalphanumeric characters, you must enclose the server name in quotation marks.

Requirement You must specify either the DSN= argument or the SERVER= and DATABASE= arguments in the INDCONN macro variable.

DATABASE=<'>databasename<'>

specifies the Greenplum database that contains the tables and views that you want to access. If the database name contains spaces or nonalphanumeric characters, you must enclose the database name in quotation marks.

Requirement You must specify either the DSN= argument or the SERVER= and DATABASE= arguments in the INDCONN macro variable.

SCHEMA=<'>schemaname<'>

specifies the schema name for the database.

Tip If you do not specify a value for the SCHEMA argument, the value of the USER argument is used as the schema name. The schema must be created by your database administrator.

PORT=<'>port-number<'>

specifies the psql port number.

Default 5432
Requirement The server-side installer uses psql, and the psql default port is 5432. To use a different port, you must have the UNIX or database administrator change the psql port.

Tip The INDCONN macro variable is not passed as an argument to the %INDGP_PUBLISH_MODEL macro. This information can be concealed in your SAS job. You might want to place it in an autoexec file and set the permissions on the file so that others cannot access the user ID and password.
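
The two INDCONN formats and the quoting rule can be illustrated as follows. This is a sketch with made-up values (user1, gpserver, mydb, myschema, green6, and both passwords are hypothetical). Only one of the two assignments would be used in a given job, because the second assignment overwrites the first.

```sas
/* Server and database form. The password is enclosed in quotation  */
/* marks because it contains a space and a nonalphabetic character. */
/* PORT= is shown only for illustration; a port other than 5432     */
/* works only if an administrator has changed the psql port.        */
%let indconn = user=user1 password='my pass#1'
               server=gpserver database=mydb
               schema=myschema port=5433;

/* DSN form. SCHEMA= is omitted, so the USER= value (user1) is used */
/* as the schema name.                                              */
%let indconn = user=user1 password=open1 dsn=green6;
```

Placing the chosen %LET statement in an autoexec file with restricted permissions, as the tip above suggests, keeps the credentials out of the job that calls the macro.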

%INDGP_PUBLISH_MODEL Macro Syntax

%INDGP_PUBLISH_MODEL
(DIR=input-directory-path, MODELNAME=name
<, MECHANISM=STATIC | EP>
<, MODELTABLE=model-table-name>
<, DATASTEP=score-program-filename>
<, XML=xml-filename>
<, DATABASE=database-name>
<, FMTCAT=format-catalog-filename>
<, ACTION=CREATE | REPLACE | DROP>
<, OUTDIR=diagnostic-output-directory>
);
Arguments

DIR=input-directory-path

specifies the directory where the scoring model program, the properties file, and the format catalog are located.

This is the directory that is created by the SAS Enterprise Miner Score Code Export node. This directory contains the score.sas file, the score.xml file, and (if user-defined formats were used) the format catalog.
Requirement You must use a fully qualified pathname.
Interaction If you do not use the default directory that is created by SAS Enterprise Miner, you must specify the DATASTEP=, XML=, and (if needed) FMTCAT= arguments.
See Special Characters in Directory Names

MODELNAME=name

specifies the name that is prepended to each output function to ensure that each scoring function name is unique in the Greenplum database.

Restriction The scoring function name is a combination of the model and output variable names. A scoring function name cannot exceed 63 characters. For more information, see Scoring Function Names.
Requirement The model name must be a valid SAS name that is 10 characters or fewer. For more information about valid SAS names, see the topic on rules for words and names in SAS Language Reference: Concepts.
Interaction Only the EM_ output variables are published as Greenplum scoring functions. For more information about the EM_ output variables, see Fixed Variable Names and Scoring Function Names.

MECHANISM=STATIC | EP

specifies whether scoring functions or scoring files are created. MECHANISM= can have one of the following values:

STATIC

specifies that scoring functions are created.

These scoring functions are used in an SQL query to run the scoring model.
See Using Scoring Functions to Run Scoring Models

EP

specifies that scoring files are created.

These scoring files are used by the SAS Embedded Process to run the scoring model. A single entry in the model table is inserted for each new model. The entry contains both the score.sas and score.xml files in separate columns. The scoring process includes reading these entries from the table and transferring them to each instance of the SAS Embedded Process for execution.
Requirement If you specify MECHANISM=EP, you must also specify the MODELTABLE= argument.
Note The SAS Embedded Process might require a later release of Greenplum than function-based scoring. For more information, see the SAS Foundation system requirements documentation for your operating environment.
See Using the SAS Embedded Process to Run Scoring Models
Default STATIC

MODELTABLE=model-table-name

specifies the name of the model table where the scoring files are published.

Default sas_model_table
Restriction This argument is valid only when using the SAS Embedded Process.
Requirement The name of the model table must be the same as the name specified in the %INDGP_CREATE_MODELTABLE macro. For more information, see the MODELTABLE argument in %INDGP_CREATE_MODELTABLE Macro Syntax.
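
Publishing for the SAS Embedded Process might look like the following sketch. The model table name my_models, the directory, and the model name are hypothetical. The call to %INDGP_CREATE_MODELTABLE uses only the MODELTABLE= argument that is mentioned above, so see %INDGP_CREATE_MODELTABLE Macro Syntax for its full argument list. The sketch assumes that INDCONN has already been assigned.

```sas
/* Create the model table first (hypothetical table name). */
%indgp_create_modeltable(modeltable=my_models);

/* Publish scoring files instead of scoring functions. MODELTABLE= */
/* must name the same table that was created above.                */
%indgp_publish_model(dir=/sasmodels/churn1,
                     modelname=churn1,
                     mechanism=ep,
                     modeltable=my_models);
```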

DATASTEP=score-program-filename

specifies the name of the scoring model program file that was created by using the SAS Enterprise Miner Score Code Export node.

Default score.sas
Restriction Only DATA step programs that are produced by the SAS Enterprise Miner Score Code Export node can be used.
Interaction If you use the default score.sas file that is created by the SAS Enterprise Miner Score Code Export node, you do not need to specify the DATASTEP= argument.

XML=xml-filename

specifies the name of the properties XML file that was created by the SAS Enterprise Miner Score Code Export node.

Default score.xml
Restrictions Only XML files that are produced by the SAS Enterprise Miner Score Code Export node can be used.
If you use scoring functions to run scoring models, the maximum number of output variables is 128. If you use the SAS Embedded Process, the maximum is 1660.
Interaction If you use the default score.xml file that is created by the SAS Enterprise Miner Score Code Export node, you do not need to specify the XML= argument.

DATABASE=database-name

specifies the name of a Greenplum database to which the scoring functions and formats are published.

Restriction If you specify DSN= in the INDCONN macro variable, do not use the DATABASE argument.
Interaction The database that is specified by the DATABASE= argument takes precedence over the database that you specify in the INDCONN macro variable. For more information, see %INDGP_PUBLISH_MODEL Macro Run Process.
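
The precedence rule can be illustrated with a short sketch. The database names salesdb and testdb, the directory, and the model name are hypothetical. The SERVER= and DATABASE= form of INDCONN is used because the DATABASE= argument must not be combined with DSN=.

```sas
/* INDCONN names the database salesdb. */
%let indconn = user=youruserid password=yourpwd
               server=yourserver database=salesdb;

/* DATABASE= on the macro takes precedence, so the scoring functions */
/* are published to testdb rather than to salesdb.                   */
%indgp_publish_model(dir=/sasmodels/churn1,
                     modelname=churn1,
                     database=testdb);
```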

FMTCAT=format-catalog-filename

specifies the name of the format catalog file that contains all user-defined formats that were created by the FORMAT procedure and that are referenced in the DATA step scoring model program.

Restriction Only format catalog files that are produced by the SAS Enterprise Miner Score Code Export node can be used.
Interactions If you use the default format catalog that is created by the SAS Enterprise Miner Score Code Export node, you do not need to specify the FMTCAT= argument.
If you do not use the default catalog name (FORMATS) or the default library (WORK or LIBRARY) when you create user-defined formats, you must use the FMTSEARCH system option to specify the location of the format catalog. For more information, see PROC FORMAT in the Base SAS Procedures Guide.
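
When the files in the input directory do not use the default names, the DATASTEP=, XML=, and FMTCAT= arguments identify them. The following sketch assumes hypothetical file names (churn_score.sas, churn_score.xml, and myformats.sas7bcat) in a hypothetical directory, and assumes that INDCONN has already been assigned. The files must still be ones that the Score Code Export node produced.

```sas
/* Name each nondefault file explicitly. Omit FMTCAT= if the model */
/* does not reference user-defined formats.                        */
%indgp_publish_model(dir=/sasmodels/custom,
                     modelname=churn2,
                     datastep=churn_score.sas,
                     xml=churn_score.xml,
                     fmtcat=myformats.sas7bcat);
```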

ACTION=CREATE | REPLACE | DROP

specifies one of the following actions that the macro performs:

CREATE

creates a new function.

REPLACE

overwrites the current function, if a function by the same name is already registered.

DROP

causes all functions for this model to be dropped from the Greenplum database.

Default CREATE
Tip If the function has been previously defined and you specify ACTION=CREATE, you receive warning messages from Greenplum. If the function has been previously defined and you specify ACTION=REPLACE, no warnings are issued.
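
A typical use of the ACTION= values is sketched below. The directory and model name are hypothetical, and INDCONN is assumed to be assigned already.

```sas
/* Republish an existing model: overwrite functions that are */
/* already registered under the same names.                  */
%indgp_publish_model(dir=/sasmodels/churn1,
                     modelname=churn1,
                     action=replace);

/* Retire the model: drop all of its functions from the */
/* Greenplum database.                                  */
%indgp_publish_model(dir=/sasmodels/churn1,
                     modelname=churn1,
                     action=drop);
```

Using ACTION=REPLACE when republishing avoids the Greenplum warnings that ACTION=CREATE produces for functions that already exist.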

OUTDIR=diagnostic-output-directory

specifies a directory that contains diagnostic files.

Files that are produced include an event log that contains detailed information about the success or failure of the publishing process and sample SQL code (SampleSQL.txt). For more information about the SampleSQL.txt file, see Scoring Function Names.
Tip This argument is useful when testing your scoring models.
See Special Characters in Directory Names
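
While testing a model, a call with OUTDIR= might look like the following sketch. Both directory paths and the model name are hypothetical, the diagnostic directory is assumed to exist already, and INDCONN is assumed to be assigned.

```sas
/* Write the event log and SampleSQL.txt to a separate diagnostic */
/* directory while the model is being tested.                     */
%indgp_publish_model(dir=/sasmodels/churn1,
                     modelname=churn1,
                     action=replace,
                     outdir=/sasmodels/churn1/diag);
```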