The SAS publishing macros are used to publish the formats and the
scoring functions in Greenplum.
The %INDGP_PUBLISH_MODEL macro creates the files that are needed to
build the scoring functions and publishes the scoring functions with
those files to a specified database in Greenplum. Only the EM_ output
variables are published as Greenplum scoring functions. For more
information about the EM_ output variables, see Fixed Variable Names.
The %INDGP_PUBLISH_MODEL macro uses some of the files that are created
by the SAS Enterprise Miner Score Code Export node: the scoring model
program (score.sas file), the properties file (score.xml file), and
(if the training data includes SAS user-defined formats) a format
catalog.
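
Before publishing, it can help to confirm that the export directory holds the files named above. The following is a minimal Python sketch of such a check, not part of the SAS tooling. The function name `check_export_dir` is hypothetical, and the catalog filename `formats.sas7bcat` is an assumption; the source says only that a format catalog may be present.

```python
from pathlib import Path

# Filenames named in the text above.
REQUIRED_FILES = ("score.sas", "score.xml")
# Assumption: the source does not name the catalog file; this is a placeholder.
OPTIONAL_CATALOG = "formats.sas7bcat"


def check_export_dir(export_dir):
    """Return (missing_required, has_format_catalog) for an export directory."""
    root = Path(export_dir)
    # Collect every required file that is absent, preserving the listed order.
    missing = [name for name in REQUIRED_FILES if not (root / name).is_file()]
    # The format catalog is optional, so only report whether it exists.
    has_catalog = (root / OPTIONAL_CATALOG).is_file()
    return missing, has_catalog
```

An empty `missing` list means that both required files are present. The second value indicates whether a format catalog would also need to be processed.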
The %INDGP_PUBLISH_MODEL macro performs the following tasks:
- takes the score.sas and score.xml files and produces a set of .c and
  .h files. These .c and .h files are necessary to build separate
  scoring functions for each of a fixed set of quantities that can be
  computed by the scoring model code.
- if a format catalog is available, processes the format catalog and
  creates an .h file with C structures, which are also necessary to
  build the scoring functions.
- produces a script of the Greenplum commands that are used to register
  the scoring functions on the Greenplum database.
- transfers the .c and .h files to Greenplum.
- calls the SAS_COMPILEUDF function to compile the source files into
  object files and to link them to the SAS formats library.
- calls the SAS_COPYUDF function to copy the new object files to
  full-path-to-pkglibdir/SAS on the whole database array (master and
  all segments), where full-path-to-pkglibdir is the path that was
  defined during installation.
- uses the SAS/ACCESS Interface to Greenplum to run the script to create
  the scoring functions with the object files.
The scoring functions are registered in Greenplum with shared object
files, which are loaded at run time. These functions are stored in a
permanent location. The SAS object files and the SAS formats library
are stored in the full-path-to-pkglibdir/SAS directory on all nodes,
where full-path-to-pkglibdir is the path that was defined during
installation.
Greenplum caches the object files within a session.
Note: You can publish scoring model files with the same model name in
multiple databases and schemas. Because all model object files for the
SAS scoring function are stored in the full-path-to-pkglibdir/SAS
directory, the publishing macros use the database, schema, and model
name as the object filename to avoid potential naming conflicts.
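
The naming idea in this note can be illustrated with a short Python sketch. It is an illustration only: the source does not give the exact filename format that the macros use, so the function `object_file_stem`, the underscore separator, and the sample names are all hypothetical.

```python
def object_file_stem(database, schema, model_name, sep="_"):
    """Join database, schema, and model name into one stem (hypothetical format)."""
    parts = (database, schema, model_name)
    # All three identifiers are needed to keep stems distinct across locations.
    if any(not part for part in parts):
        raise ValueError("database, schema, and model name are all required")
    return sep.join(parts)


# The same model name published to two schemas yields two distinct stems,
# so both object files can live in the same directory.
stems = {
    object_file_stem("sales", "public", "churn"),
    object_file_stem("sales", "staging", "churn"),
}
assert len(stems) == 2
```

A plain separator such as the one assumed here could still collide when the identifiers themselves contain that separator, so a real scheme would need a rule for that case.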