The SAS publishing macros
are used to publish the formats and the scoring functions in Greenplum.

The %INDGP_PUBLISH_MODEL macro creates the files that are needed to build
the scoring functions and publishes the scoring functions with those files
to a specified database in Greenplum. Only the EM_ output variables are
published as Greenplum scoring functions. For more information about the
EM_ output variables, see Fixed Variable Names.

The %INDGP_PUBLISH_MODEL macro uses some of the files that are created by
the SAS Enterprise Miner Score Code Export node: the scoring model program
(score.sas file), the properties file (score.xml file), and (if the
training data includes SAS user-defined formats) a format catalog.
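
Before publishing, it can help to confirm that the exported files are actually present. The following is a minimal sketch of such a check, not part of SAS: the helper name `missing_publish_inputs` is hypothetical, and the format catalog file name is passed in by the caller because the text does not state it. Only the names score.sas and score.xml come from the description above.

```python
import tempfile
from pathlib import Path
from typing import List, Optional


def missing_publish_inputs(export_dir: str,
                           format_catalog: Optional[str] = None) -> List[str]:
    """Return the expected input files that are absent from export_dir."""
    # score.sas and score.xml are the file names given in the text.
    expected = ["score.sas", "score.xml"]
    # A format catalog is needed only when the training data includes
    # user-defined formats; its file name is an assumption supplied by
    # the caller.
    if format_catalog is not None:
        expected.append(format_catalog)
    base = Path(export_dir)
    return [name for name in expected if not (base / name).is_file()]


if __name__ == "__main__":
    # Demonstration with a temporary directory that holds only score.sas.
    with tempfile.TemporaryDirectory() as export_dir:
        Path(export_dir, "score.sas").write_text("/* scoring program */")
        print(missing_publish_inputs(export_dir))  # ['score.xml']
```

An empty list means that every expected file was found in the directory.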

The %INDGP_PUBLISH_MODEL macro performs the following tasks:
- takes the score.sas and score.xml files and produces a set of .c and .h
  files. These .c and .h files are necessary to build separate scoring
  functions for each of a fixed set of quantities that can be computed by
  the scoring model code.
- if a format catalog is available, processes the format catalog and
  creates an .h file with C structures, which are also necessary to build
  the scoring functions.
- produces a script of the Greenplum commands that are used to register
  the scoring functions in the Greenplum database.
- transfers the .c and .h files to Greenplum.
- calls the SAS_COMPILEUDF function to compile the source files into
  object files and to link them to the SAS formats library.
- calls the SAS_COPYUDF function to copy the new object files to
  full-path-to-pkglibdir/SAS on the whole database array (master and all
  segments), where full-path-to-pkglibdir is the path that was defined
  during installation.
- uses the SAS/ACCESS Interface to Greenplum to run the script to create
  the scoring functions with the object files.

The scoring functions are registered in Greenplum with shared object
files, which are loaded at run time. These functions are stored in a
permanent location. The SAS object files and the SAS formats library are
stored in the full-path-to-pkglibdir/SAS directory on all nodes, where
full-path-to-pkglibdir is the path that was defined during installation.

Greenplum caches the object files within a session.

Note: You can publish scoring model files with the same model name in
multiple databases and schemas. Because all model object files for the SAS
scoring function are stored in the full-path-to-pkglibdir/SAS directory,
the publishing macros use the database, schema, and model name as the
object filename to avoid potential naming conflicts.
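
To illustrate why combining the three parts avoids collisions in a shared directory, here is a small sketch. The function name `object_file_stem` is hypothetical, and the underscore-joined layout is an assumption made for illustration only; the text states that the database, schema, and model name are used but does not give the exact filename pattern.

```python
def object_file_stem(database: str, schema: str, model: str) -> str:
    """Combine database, schema, and model name into one file-name stem.

    The underscore separator is an assumption for illustration; the actual
    pattern used by the publishing macros is not given in the text.
    """
    return "_".join((database, schema, model))


# The same model name published to two databases maps to two distinct
# stems, so the files can coexist in one shared directory.
stems = {
    object_file_stem("sales", "public", "churn"),
    object_file_stem("marketing", "public", "churn"),
}
print(len(stems))  # 2
```

A model name alone would produce the same file for both databases, and the second publish would overwrite the first. A plain separator like this one can still be ambiguous when the names themselves contain the separator character.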