In-Database Deployment Package for Greenplum

Prerequisites

SAS Foundation and the SAS/ACCESS Interface to Greenplum must be installed before you install and configure the in-database deployment package for Greenplum.
The SAS Scoring Accelerator for Greenplum requires a certain version of the Greenplum client and server environment. For more information, see http://www.sas.com/technologies/analytics/datamining/scoring_acceleration/#section=5.
Note: Starting with the August 2012 release, SAS is not compatible with Greenplum version 4.0 or older when publishing formats or running scoring models. If you use the second maintenance release of SAS 9.3, you must use Greenplum version 4.2.2 or later and Greenplum Partner Connector (GPPC) version 1.1 or later.

Overview of the In-Database Deployment Package for Greenplum

This section describes how to install and configure the in-database deployment package for Greenplum (SAS Formats Library for Greenplum 2.3 and SAS Embedded Process 9.33).
The in-database deployment package for Greenplum must be installed and configured before you can perform the following tasks:
  • Use the %INDGP_PUBLISH_FORMATS format publishing macro to create or publish the SAS_PUT( ) function and to create or publish user-defined formats as format functions inside the database.
  • Use the %INDGP_PUBLISH_MODEL scoring publishing macro to create scoring files or functions inside the database.
The format and scoring publishing macros are included in the SAS/ACCESS Interface to Greenplum. For more information about using the format and scoring publishing macros, see the SAS In-Database Products: User's Guide.
The in-database deployment package for Greenplum contains the SAS formats library and precompiled binary files for the %INDGP_PUBLISH_COMPILEUDF macro. Starting in August 2012, the package also contains the SAS Embedded Process.
The SAS formats library is a run-time library that is installed on your Greenplum system. This installation is done so that the SAS scoring model functions and the SAS_PUT( ) function created in Greenplum can access the routines within the run-time library. The SAS formats library contains the formats that are supplied by SAS.
The %INDGP_PUBLISH_COMPILEUDF macro registers utility functions in the database. The utility functions are called by the format and scoring publishing macros: %INDGP_PUBLISH_FORMATS and %INDGP_PUBLISH_MODEL. You must run the %INDGP_PUBLISH_COMPILEUDF macro before you run the format and scoring publishing macros.
The SAS Embedded Process is a SAS server process that runs within Greenplum to read and write data. The SAS Embedded Process contains the %INDGP_PUBLISH_COMPILEUDF_EP macro, run-time libraries, and other software that is installed on your Greenplum system. The %INDGP_PUBLISH_COMPILEUDF_EP macro defines the SAS_EP table function to the Greenplum database. You use the SAS_EP table function to produce scoring models after you run the %INDGP_PUBLISH_MODEL macro to create the SAS scoring files. The SAS Embedded Process accesses the SAS scoring files when a scoring operation is performed.

Greenplum Installation and Configuration Steps

  1. If you are upgrading from or reinstalling a previous release, follow the instructions in Upgrading from or Reinstalling a Previous Version before installing the in-database deployment package.
  2. Install the SAS formats library, the binary files, and the SAS Embedded Process.
  3. Run the %INDGP_PUBLISH_COMPILEUDF macro if you want to publish formats or use scoring functions to run a scoring model. Run the %INDGP_PUBLISH_COMPILEUDF_EP macro if you want to use the SAS Embedded Process to run a scoring model.

Upgrading from or Reinstalling a Previous Version

Upgrading or Reinstalling the SAS Formats Library and the SAS Embedded Process

To upgrade from or reinstall a previous version, follow these steps.
  1. Delete the SAS directory that contains the SAS Formats Library and the SAS Embedded Process.
    The directory is found here.
    path-from-pg_config --pkglibdir/SAS/
    path-from-pg_config --pkglibdir/SAS/SASTKInDatabaseServerForGreenplum
    CAUTION:
    If you delete the SAS directory, all the scoring models that you published using scoring functions and all user-defined formats that you published are deleted.
    If you previously published scoring models using scoring functions or if you previously published user-defined formats, you have to republish your scoring models and formats. If you used the SAS Embedded Process to publish scoring models, the scoring models are not deleted.
    It is a best practice to delete the SAS directory when you upgrade from or reinstall a previous version. Doing so ensures that you get the latest version of both the SAS Formats Library and the SAS Embedded Process. However, it is possible to delete the SAS Formats Library and the SAS Embedded Process separately. For more information, see either Upgrading or Reinstalling the SAS Formats Library or Upgrading or Reinstalling the SAS Embedded Process.

Upgrading or Reinstalling the SAS Formats Library

This topic has instructions for upgrading from or reinstalling only the SAS Formats Library. See Upgrading or Reinstalling the SAS Formats Library and the SAS Embedded Process for instructions for upgrading or reinstalling both the SAS Formats Library and the SAS Embedded Process at the same time. See Upgrading or Reinstalling the SAS Embedded Process for instructions on upgrading or reinstalling only the SAS Embedded Process.
To upgrade or reinstall only the SAS Formats Library, follow these steps. If you upgrade or install the SAS Formats Library in this manner, you do not delete any scoring models or formats that were previously published.
  1. Log in to the Greenplum master node as a superuser.
  2. Move to the directory where the SAS Formats Library is installed. The directory path is path-from-pg_config --pkglibdir/SAS/
  3. Delete the libjazxfbrs.so and sas_compileudf.so files.
  4. In addition to deleting the directory on the master node, you must log on to each host node and delete the directory on these nodes.
  5. Continue the installation instructions in Moving and Installing the SAS Formats Library and Binary Files.

Upgrading or Reinstalling the SAS Embedded Process

This topic has instructions for upgrading from or reinstalling only the SAS Embedded Process. See Upgrading or Reinstalling the SAS Formats Library and the SAS Embedded Process for instructions for upgrading or reinstalling both the SAS Formats Library and the SAS Embedded Process at the same time. See Upgrading or Reinstalling the SAS Formats Library for instructions on upgrading or reinstalling only the SAS Formats Library.
To upgrade or reinstall only the SAS Embedded Process, follow these steps. If you upgrade or install the SAS Embedded Process in this manner, you do not delete any scoring models that were previously published using the SAS Embedded Process.
  1. Log in to the Greenplum master node as a superuser.
  2. Delete the directory where the SAS Embedded Process is installed. The directory path is path-from-pg_config --pkglibdir/SAS/SASTKInDatabaseServerForGreenplum.
  3. In addition to deleting the directory on the master node, you must log on to each host node and delete the directory on these nodes.
  4. Continue the installation instructions in Moving and Installing the SAS Embedded Process.

Installing the SAS Formats Library, Binary Files, and SAS Embedded Process

Moving and Installing the SAS Formats Library and Binary Files

The SAS formats library and the binary files for the publishing macros are contained in a self-extracting archive file. The self-extracting archive file is located in the SAS-install-directory/SASFormatsLibraryforGreenplum/2.3/GreenplumonLinux64/ directory.
To move and unpack the self-extracting archive file, follow these steps:
  1. Using a method of your choice, transfer the accelgplmfmt-2.3-n_lax.sh file to your Greenplum master node. n is a number that indicates the latest version of the file. If this is the initial installation, n has a value of 1. Each time you reinstall or upgrade, n is incremented by 1.
    The file does not have to be downloaded to a specific location. However, you need to note where it is downloaded so that it can be executed at a later time.
  2. After the accelgplmfmt-2.3-n_lax.sh has been transferred, log on to the Greenplum master node as a superuser.
  3. Move to the directory where the self-extracting archive file was downloaded.
  4. Use the following command at the UNIX prompt to unpack the self-extracting archive file.
    ./accelgplmfmt-2.3-n_lax.sh
    Note: If you receive a “permissions denied” message, check the permissions on the accelgplmfmt-2.3-n_lax.sh file. This file must have EXECUTE permissions to run.
    After the script runs and the files are unpacked, the content of the target directories should look similar to these where path_to_sh_file is the location to which you copied the self-extracting archive file.
    /path_to_sh_file/SAS/SASFormatsLibraryForGreenplum/2.3-1/bin/
        InstallAccelGplmFmt.sh
    /path_to_sh_file/SAS/SASFormatsLibraryForGreenplum/2.3-1/bin/
        CopySASFiles.sh 
    /path_to_sh_file/SAS/SASFormatsLibraryForGreenplum/2.3-1/lib/
        SAS_CompileUDF.so
    /path_to_sh_file/SAS/SASFormatsLibraryForGreenplum/2.3-1/lib/
        libjazxfbrs.so 
  5. Use the following command to place the files in Greenplum:
    ./path_to_sh_file/SAS/SASFormatsLibraryForGreenplum/2.3-1/bin/
        CopySASFiles.sh
    All the SAS object files are stored under full-path-to-pkglibdir/SAS. The files are copied to the master node and each of the segment nodes. This command replaces all previous versions of the libjazxfbrs.so file.
    Note: You can use the following command to determine the full-path-to-pkglibdir directory:
    $ pg_config --pkglibdir
    The pg_config --pkglibdir command must be run by the person who performed the Greenplum install.
    Note: If you add new nodes at a later date, you must copy all the binary files to the new nodes. For more information, see Step 6.
  6. (Optional)If you add new nodes to the Greenplum master node after the initial installation of the SAS formats library and publishing macro, you must copy all the binaries in the full-path-to-pkglibdir/SAS directory to the new nodes using a method of your choice such as scp /SAS. The binary files include SAS_CompileUDF.so, libjazxfbrs.so, and the binary files for the already published functions.

Moving and Installing the SAS Embedded Process

The SAS Embedded Process is contained in a self-extracting archive file. The self-extracting archive file is located in the SAS-install-directory/SASTKInDatabaseServer/9.31/GreenplumonLinux64 directory.
To move and unpack the self-extracting archive file, follow these steps:
  1. Using a method of your choice, transfer the tkindbsrv-9.33-n_lax.sh file to your Greenplum master node. n is a number that indicates the latest version of the file. If this is the initial installation, n has a value of 1. Each time you reinstall or upgrade, n is incremented by 1.
    The file does not have to be downloaded to a specific location. However, you need to note where it is downloaded so that it can be executed at a later time.
  2. After the tkindbsrv-9.33-n_lax.sh has been transferred, log in to the Greenplum master node as a superuser.
  3. Move to the directory where the self-extracting archive file was downloaded.
  4. Use the following command at the UNIX prompt to unpack the self-extracting archive file.
    ./tkindbsrv-9.33-n_lax.sh
    Note: If you receive a “permissions denied” message, check the permissions on the tkinsbsrv-9.33-n_lax.sh file. This file must have EXECUTE permissions to run.
    After the script runs and the files are unpacked, the contents of the target directories should look similar to these. path_to_sh_file is the location to which you copied the self-extracting archive file in Step 1.
    /path_to_sh_file/InstallSASEPFiles.sh
    /path_to_sh_file/StartupSASEP.sh
    /path_to_sh_file/ShutdownSASEP.sh
    /path_to_sh_file/SAS/SASTKInDatabaseServerForGreenplum/9.33/admin
    /path_to_sh_file//SAS/SASTKInDatabaseServerForGreenplum/9.33/bin
    /path_to_sh_file//SAS/SASTKInDatabaseServerForGreenplum/9.33/logs
    /path_to_sh_file//SAS/SASTKInDatabaseServerForGreenplum/9.33/misc
    /path_to_sh_file//SAS/SASTKInDatabaseServerForGreenplum/9.33/sasexe
    /path_to_sh_file//SAS/SASTKInDatabaseServerForGreenplum/9.33/utilities
    Note: In addition to the /path_to_sh_file/ directory, the InstallSASEPFiles.sh, StartupSASEP.sh, and ShutdownSASEP.sh files are also placed in the /path_to_sh_file/SAS/SASTKInDatabaseServerForGreenplum/9.33/admin directory.
    The InstallSASEPFiles.sh file installs the SAS Embedded Process. The next step explains how to run this file. The StartupSASEP.sh and ShutdownSASEP.sh files enable you to manually start and stop the SAS Embedded Process. For more information about running these two files, see Controlling the SAS Embedded Process.
  5. Use the following commands at the UNIX prompt to install the SAS Embedded Process on the master node.
    The InstallSASEPFiles.sh file must be run from the /path_to_sh_file/ directory.
    cd /path_to_sh_file/
    ./InstallSASEPFiles.sh <-verbose>
    Note: -verbose is optional and enables you to see all messages generated during the installation process.
    The installation deploys the SAS Embedded Process to all the host nodes automatically.
    The installation also creates a path-from-pg_config --pkglibdir/SAS directory. This directory is created on the master node and each host node.
    The installation also copies the SAS directories and files from Step 1 across every node.
    The contents of the path-from-pg_config --pkglibdir/SAS/SASTKInDatabaseServerForGreenplum directory should look similar to these.
    path-from-pg_config --pkglibdir/SAS/SASTKInDatabaseServerForGreenplum/
       9.33/admin
    path-from-pg_config --pkglibdir/SAS/SASTKInDatabaseServerForGreenplum/
       9.33/bin
    path-from-pg_config --pkglibdir/SAS/SASTKInDatabaseServerForGreenplum/
       9.33/logs
    path-from-pg_config --pkglibdir/SAS/SASTKInDatabaseServerForGreenplum/
       9.33/misc
    path-from-pg_config --pkglibdir/SAS/SASTKInDatabaseServerForGreenplum/
       9.33/sasexe
    path-from-pg_config --pkglibdir/SAS/SASTKInDatabaseServerForGreenplum/
       9.33/utilities
    Use the following command to verify that the directory is created.
    $ pg_config --pkglibdir
    This is an example of a SAS directory.
    usr/local/greenplum-db-4.2.2.0/lib/postgresql/SAS

Running the %INDGP_PUBLISH_COMPILEUDF Macro

Overview of the %INDGP_PUBLISH_COMPILEUDF Macro

Use the %INDGP_PUBLISH_COMPILEUDF macro if you want to use scoring functions to run scoring models.
Note: Use the %INDGP_PUBLISH_COMPILEUDF_EP macro if you want to use the SAS Embedded Process to run scoring models. For more information, see Running the %INDGP_PUBLISH_COMPILEUDF_EP Macro.
The %INDGP_PUBLISH_COMPILEUDF macro publishes the following functions to the SASLIB schema in a Greenplum database:
  • SAS_COMPILEUDF function
    This function facilitates the %INDGP_PUBLISH_FORMATS format publishing macro and the %INDGP_PUBLISH_MODEL scoring publishing macro. The SAS_COMPILEUDF function performs the following tasks:
    • compiles the format and scoring model source files into object files. This compilation occurs through the SQL interface using an appropriate compiler for the system.
    • links with the SAS formats library.
    • copies the object files to the full-path-to-pkglibdir/SAS directory. All the SAS object files are stored under full-path-to-pkglibdir/SAS. You can use the pg_config --pkglibdir command to determine the full-path-to-pkglibdir directory.
  • Three utility functions that are used when the scoring publishing macro transfers source files from the client to the host:
    • SAS_COPYUDF function
      This function copies the shared libraries to the full-path-to-pkglibdir/SAS path on the whole database array including the master and all segments.
    • SAS_DIRECTORYUDF function
      This function creates and removes a temporary directory that holds the source files on the server.
    • SAS_DEHEXUDF function
      This function converts the files from hexadecimal back to text after the files are exported on the host.
You have to run the %INDGP_PUBLISH_COMPILEUDF macro only one time in each database.
Note: The SAS_COMPILEUDF, SAS_COPYUDF, SAS_DIRECTORYUDF, and SAS_DEHEXUDF functions must be published before you run the %INDGP_PUBLISH_FORMATS or the %INDGP_PUBLISH_MODEL macro. Otherwise, these macros fail.
Note: To publish the SAS_COMPILEUDF, SAS_COPYUDF, SAS_DIRECTORYUDF, and SAS_DEHEXUDF functions, you must have superuser permissions to create and execute these functions in the SASLIB schema and in the specified database.

%INDGP_PUBLISH_COMPILEUDF Macro Run Process

To run the %INDGP_PUBLISH_COMPILEUDF macro, follow these steps:
Note: To publish the SAS_COMPILEUDF function, you must have superuser permissions to create and execute this function in the SASLIB schema and in the specified database.
  1. Create a SASLIB schema in the database where the SAS_COMPILEUDF, SAS_COPYUDF, SAS_DIRECTORYUDF, and SAS_DEHEXUDF functions are published.
    You must use “SASLIB” as the schema name for Greenplum in-database processing to work correctly.
    You specify that database in the DATABASE argument of the %INDGP_PUBLISH_COMPILEUDF macro. For more information, see %INDGP_PUBLISH_COMPILEUDF Macro Syntax.
    The SASLIB schema contains the SAS_COMPILEUDF, SAS_COPYUDF, SAS_DIRECTORYUDF, and SAS_DEHEXUDF functions.
  2. Start SAS 9.3 and submit the following commands in the Enhanced Editor or Program Editor:
    %indgppc;
    %let indconn = user=youruserid password=yourpwd dsn=yourdsn;
    /* You can use server=yourserver database=yourdb instead of dsn=yourdsn */
    
    For more information, see %INDGPPC Macro and INDCONN Macro Variable.
  3. Run the %INDGP_PUBLISH_COMPILEUDF macro.
    For more information, see %INDGP_PUBLISH_COMPILEUDF Macro Syntax.
You can verify that the SAS_COMPILEUDF, SAS_COPYUDF, SAS_DIRECTORYUDF, and SAS_DEHEXUDF functions have been published successfully. For more information, see Validating the Publishing of the SAS_COMPILEUDF, SAS_COPYUDF, SAS_DIRECTORYUDF, and SAS_DEHEXUDF Functions.

%INDGPPC Macro

The %INDGPPC macro is an autocall library that initializes the %INDGP_PUBLISH_COMPILEUDF macro.

INDCONN Macro Variable

The INDCONN macro variable provides the credentials to make a connection to Greenplum. You must specify the user, password, and either the DSN or server and database information to access the machine on which you have installed the Greenplum database. You must assign the INDCONN macro variable before the %INDGP_PUBLISH_COMPILEUDF macro is invoked.
The value of the INDCONN macro variable for the %INDGP_PUBLISH_COMPILEUDF macro has one of these formats:
USER=<'>userid<'> PASSWORD=<'>password<'> DSN=<'>dsnname<'>
USER=<'>userid<'> PASSWORD=<'>password<'> SERVER=<'>server<'>
DATABASE=<'>database<'>

USER=<'>userid<'>

specifies the Greenplum user name (also called the user ID) that is used to connect to the database. If the user name contains spaces or nonalphanumeric characters, enclose the user name in quotation marks.

PASSWORD=<'>password<'>

specifies the password that is associated with your Greenplum user ID. If the password contains spaces or nonalphabetic characters, enclose the password in quotation marks.

Tip You can use only PASSWORD=, PASS=, or PW= for the password argument. PWD= is not supported and causes an error.

DSN=<'>datasource<'>

specifies the configured Greenplum ODBC data source to which you want to connect. If the DSN name contains spaces or nonalphabetic characters, enclose the DSN name in quotation marks.

Requirement You must specify either the DSN= argument or the SERVER= and DATABASE= arguments in the INDCONN macro variable.

SERVER=<'>server<'>

specifies the Greenplum server name or the IP address of the server host. If the server name contains spaces or nonalphanumeric characters, enclose the server name in quotation marks.

Requirement You must specify either the DSN= argument or the SERVER= and DATABASE= arguments in the INDCONN macro variable.

DATABASE=<'>database<'>

specifies the Greenplum database that contains the tables and views that you want to access. If the database name contains spaces or nonalphanumeric characters, enclose the database name in quotation marks.

Requirement You must specify either the DSN= argument or the SERVER= and DATABASE= arguments in the INDCONN macro variable.

Note: The SAS_COMPILEUDF, SAS_COPYUDF, SAS_DIRECTORYUDF, and SAS_DEHEXUDF functions are published to the SASLIB schema in the specified database. The SASLIB schema must be created before publishing the SAS_COMPILEUDF, SAS_COPYUDF, SAS_DIRECTORYUDF, and SAS_DEHEXUDF functions.

%INDGP_PUBLISH_COMPILEUDF Macro Syntax

%INDGP_PUBLISH_COMPILEUDF
(OBJPATH=full-path-to-pkglibdir/SAS
<, DATABASE=database-name>
<, ACTION=CREATE | REPLACE | DROP>
<, OUTDIR=diagnostic-output-directory>
);
Arguments

OBJPATH=full-path-to-pkglibdir/SAS

specifies the parent directory where all the object files are stored.

Tip The full-path-to-pkglibdir directory was created during installation of the self-extracting archive file. You can use the pg_config --pkglibdir command to determine the name of the full-path-to-pkglibdir directory.

DATABASE=database-name

specifies the name of a Greenplum database to which the SAS_COMPILEUDF, SAS_COPYUDF, SAS_DIRECTORYUDF, and SAS_DEHEXUDF functions are published.

Restriction If you specify DSN= in the INDCONN macro variable, do not use the DATABASE argument.

ACTION=CREATE | REPLACE | DROP

specifies that the macro performs one of the following actions:

CREATE

creates a new SAS_COMPILEUDF, SAS_COPYUDF, SAS_DIRECTORYUDF, and SAS_DEHEXUDF function.

REPLACE

overwrites the current SAS_COMPILEUDF, SAS_COPYUDF, SAS_DIRECTORYUDF, and SAS_DEHEXUDF functions, if a function by the same name is already registered, or creates a new SAS_COMPILEUDF, SAS_COPYUDF, SAS_DIRECTORYUDF, and SAS_DEHEXUDF function if one is not registered.

Requirement If you are upgrading from or reinstalling the SAS Formats Library, run the %INDGP_PUBLISH_COMPILEUDF macro with ACTION=REPLACE. The CopySASFiles.sh install script replaces existing versions of most files. However, you need to replace the existing SAS_COMPILEUDF, SAS_COPYUDF, SAS_DIRECTORYUDF, and SAS_DEHEXUDF functions after you run the CopySASFiles.sh install script. For more information, see Upgrading from or Reinstalling a Previous Version and Moving and Installing the SAS Formats Library and Binary Files.

DROP

causes the SAS_COMPILEUDF, SAS_COPYUDF, SAS_DIRECTORYUDF, and SAS_DEHEXUDF functions to be dropped from the Greenplum database.

Default CREATE
Tip If the SAS_COMPILEUDF, SAS_COPYUDF, SAS_DIRECTORYUDF, and SAS_DEHEXUDF functions were published previously and you specify ACTION=CREATE, you receive warning messages that the functions already exist and you are prompted to use REPLACE. If the SAS_COMPILEUDF, SAS_COPYUDF, SAS_DIRECTORYUDF, and SAS_DEHEXUDF functions were published previously and you specify ACTION=REPLACE, no warnings are issued.

OUTDIR=output-directory

specifies a directory that contains diagnostic files.

Tip Files that are produced include an event log that contains detailed information about the success or failure of the publishing process.

Running the %INDGP_PUBLISH_COMPILEUDF_EP Macro

Overview of the %INDGP_PUBLISH_COMPILEUDF_EP Macro

Use the %INDGP_PUBLISH_COMPILEUDF_EP macro if you want to use the SAS Embedded Process to run scoring models.
Note: Use the %INDGP_PUBLISH_COMPILEUDF macro if you want to use scoring functions to run scoring models. For more information, see Running the %INDGP_PUBLISH_COMPILEUDF Macro.
The %INDGP_PUBLISH_COMPILEUDF_EP macro registers the SAS_EP table function in the database.
You have to run the %INDGP_PUBLISH_COMPILEUDF_EP macro only one time in each database where scoring models are published.
The %INDGP_PUBLISH_COMPILEUDF_EP macro must be run before you use the SAS_EP function in an SQL query.
Note: To publish the SAS_EP function, you must have superuser permissions to create and execute this function in the specified schema and database.

%INDGP_PUBLISH_COMPILEUDF_EP Macro Run Process

To run the %INDGP_PUBLISH_COMPILEUDF_EP macro, follow these steps:
Note: To publish the SAS_EP function, you must have superuser permissions to create and execute this function in the specified schema and database.
  1. Create a schema in the database where the SAS_EP function is published.
    You specify the schema and database in the INDCONN macro variable. For more information, see INDCONN Macro Variable.
  2. Start SAS 9.3 and submit the following commands in the Enhanced Editor or Program Editor:
    %indgppc;
    %let indconn = user=youruserid password=yourpwd dsn=yourdsn <schema=yourschema>;
    /* You can use server=yourserver database=yourdb instead of dsn=yourdsn */
    
    For more information, see %INDGPPC Macro and INDCONN Macro Variable.
  3. Run the %INDGP_PUBLISH_COMPILEUDF_EP macro. For more information, see %INDGP_PUBLISH_COMPILEUDF_EP Macro Syntax.
You can verify that the SAS_EP function has been published successfully. For more information, see Validating the Publishing of the SAS_EP Function.

%INDGPPC Macro

The %INDGPPC macro is an autocall library that initializes the %INDGP_PUBLISH_COMPILEUDF_EP macro.

INDCONN Macro Variable

The INDCONN macro variable provides the credentials to make a connection to Greenplum. You must specify the user, password, and either the DSN or server and database information to access the machine on which you have installed the Greenplum database. You must assign the INDCONN macro variable before the %INDGP_PUBLISH_COMPILEUDF_EP macro is invoked.
The value of the INDCONN macro variable for the %INDGP_PUBLISH_COMPILEUDF_EP macro has one of these formats:
USER=<'>userid<'> PASSWORD=<'>password<'> DSN=<'>dsnname <'>
<SCHEMA=<'>schema<'>>
USER=<'>userid<'> PASSWORD=<'>password<'> SERVER=<'>server<'>
DATABASE=<'>database<'> <SCHEMA=<'>schema<'>>

USER=<'>userid<'>

specifies the Greenplum user name (also called the user ID) that is used to connect to the database. If the user name contains spaces or nonalphanumeric characters, enclose the user name in quotation marks.

PASSWORD=<'>password<'>

specifies the password that is associated with your Greenplum user ID. If the password contains spaces or nonalphabetic characters, enclose the password in quotation marks.

Tip You can use only PASSWORD=, PASS=, or PW= for the password argument. PWD= is not supported and causes an error.

DSN=<'>datasource<'>

specifies the configured Greenplum ODBC data source to which you want to connect. If the DSN name contains spaces or nonalphabetic characters, enclose the DSN name in quotation marks.

Requirement You must specify either the DSN= argument or the SERVER= and DATABASE= arguments in the INDCONN macro variable.

SERVER=<'>server<'>

specifies the Greenplum server name or the IP address of the server host. If the server name contains spaces or nonalphanumeric characters, enclose the server name in quotation marks.

Requirement You must specify either the DSN= argument or the SERVER= and DATABASE= arguments in the INDCONN macro variable.

DATABASE=<'>database<'>

specifies the Greenplum database that contains the tables and views that you want to access. If the database name contains spaces or nonalphanumeric characters, enclose the database name in quotation marks.

Requirement You must specify either the DSN= argument or the SERVER= and DATABASE= arguments in the INDCONN macro variable.

SCHEMA=<'>schema<'>

specifies the name of the schema where the SAS_EP function is defined.

Default SASLIB
Requirement You must create the schema in the database before you run the %INDB_PUBLISH_COMPILEUDF_EP macro.

%INDGP_PUBLISH_COMPILEUDF_EP Macro Syntax

%INDGP_PUBLISH_COMPILEUDF_EP
(<OBJPATH=full-path-to-pkglibdir/SAS>
<, DATABASE=database-name>
<, ACTION=CREATE | REPLACE | DROP>
<, OUTDIR=diagnostic-output-directory>
);
Arguments

OBJPATH=full-path-to-pkglibdir/SAS

specifies the parent directory where all the object files are stored.

Tip The full-path-to-pkglibdir directory was created during installation of the InstallSASEP.sh self-extracting archive file. You can use the pg_config --pkglibdir command to determine the name of the full-path-to-pkglibdir directory.

DATABASE=database-name

specifies the name of a Greenplum database where the SAS_EP function is defined.

Restriction If you specify DSN= in the INDCONN macro variable, do not use the DATABASE argument.

ACTION=CREATE | REPLACE | DROP

specifies that the macro performs one of the following actions:

CREATE

creates a new SAS_EP function.

REPLACE

overwrites the current SAS_EP function, if a function by the same name is already registered, or creates a new SAS_EP function if one is not registered.

Requirement If you are upgrading from or reinstalling the SAS Embedded Process, run the %INDGP_PUBLISH_COMPILEUDF_EP macro with ACTION=REPLACE. The InstallSASEPFiles.sh install script replaces existing versions of most files. However, you need to replace the existing SAS_EP function after you run the InstallSASEPFiles.sh install script. For more information, see Upgrading from or Reinstalling a Previous Version and Moving and Installing the SAS Embedded Process.

DROP

causes the SAS_EP function to be dropped from the Greenplum database.

Default CREATE
Tip If the SAS_EP function was defined previously and you specify ACTION=CREATE, you receive warning messages that the functions already exist and you are prompted to use REPLACE. If the SAS_EP function was defined previously and you specify ACTION=REPLACE, no warnings are issued.

OUTDIR=output-directory

specifies a directory that contains diagnostic files.

Tip Files that are produced include an event log that contains detailed information about the success or failure of the publishing process.

Validating the Publishing of the SAS_COMPILEUDF, SAS_COPYUDF, SAS_DIRECTORYUDF, and SAS_DEHEXUDF Functions

To validate that the SAS_COMPILEUDF, SAS_COPYUDF, SAS_DIRECTORYUDF, and SAS_DEHEXUDF functions are registered properly under the SASLIB schema in the specified database, follow these steps.
  1. Use psql to connect to the database.
    psql -d databasename
    You should receive the following prompt.
    databasename=#
  2. At the prompt, enter the following command.
    select prosrc from pg_proc f, pg_namespace s where f.pronamespace=s.oid 
        and upper(s.nspname)='SASLIB';
    You should receive a result similar to the following:
    SAS_CompileUDF
    SAS_CopyUDF
    SAS_DirectoryUDF
    SAS_DehexUDF

Validating the Publishing of the SAS_EP Function

To validate that the SAS_EP function is registered properly under the specified schema in the specified database, follow these steps.
  1. Use psql to connect to the database.
    psql -d databasename
    You should receive the following prompt.
    databasename=#
  2. At the prompt, enter the following command.
    select prosrc, probin from pg_catalog.pg_proc where proname = 'sas_ep';
    You should receive a result similar to the following:
    SAS_EP | $libdir/SAS/sasep_tablefunc.so

Controlling the SAS Embedded Process

The SAS Embedded Process starts when a query is submitted using the SAS_EP function. It continues to run until it is manually stopped or the database is shut down.
Note: Starting and stopping the SAS Embedded Process has implications for all scoring model publishers.
Note: Manually starting and stopping the SAS Embedded Process requires superuser permissions and must be done from the Greenplum master node.
When the SAS Embedded Process is installed, the ShutdownSASEP.sh and StartupSASEP.sh scripts are installed in the following directory. For more information about these files, see Moving and Installing the SAS Embedded Process.
/path_to_sh_file/SAS/SASTKInDatabaseServerForGreenplum/9.33
Use the following command to shut down the SAS Embedded Process.
/path_to_sh_file/SAS/SASTKInDatabaseServerForGreenplum/9.33/ShutdownSASEP.sh
    <-verbose>
When invoked from the master node, ShutdownSASEP.sh shuts down the SAS Embedded Process on each database node. The “-verbose” option provides a status of the shutdown operations as they occur. This script should not be used as part of the normal operation. It is designed to be used to shut down the SAS Embedded Process prior to a database upgrade or re-install.
Use the following command to start the SAS Embedded Process.
/path_to_sh_file/SAS/SASTKInDatabaseServerForGreenplum/9.33/StartupSASEP.sh
    <-verbose>
When invoked from the master node, StartupSASEP.sh manually starts the SAS Embedded Process on each database node. The “-verbose” option provides a status of the start-up operations as they occur. This script should not be used as part of the normal operation. It is designed to be used to manually start the SAS Embedded Process and only after consultation with SAS Technical Support.

Greenplum Permissions

To publish the utility (SAS_COMPILEUDF, SAS_COPYUDF, SAS_DIRECTORYUDF, SAS_DEHEXUDF, SAS_EP), format, and scoring model functions, Greenplum requires that you have superuser permissions to create and execute these functions in the SASLIB (or other specified) schema and in the specified database.
In addition to Greenplum superuser permissions, you must have CREATE TABLE permission to create a model table when using the SAS Embedded Process.
If you plan to use SAS Model Manager with the SAS Scoring Accelerator for in-database scoring, additional permissions are required. For more information, see Configurations for SAS Model Manager.

Documentation for Publishing SAS Formats and Scoring Models in Greenplum

For information about how to publish SAS formats and scoring models, see the SAS In-Database Products: User's Guide, located at http://support.sas.com/documentation/onlinedoc/indbtech/index.html.