Overview of the SAS Scoring Accelerator for Greenplum

When using conventional processing to access data inside a Greenplum database, SAS Enterprise Miner asks the SAS/ACCESS engine for all rows of the table being processed. The SAS/ACCESS engine generates an SQL SELECT * statement that is passed to the Greenplum database. That SELECT statement fetches all the rows in the table, and the SAS/ACCESS engine returns them to SAS Enterprise Miner. As the number of rows in the table grows over time, more data must be moved from the Greenplum database to the SAS scoring process, so the time spent transferring data across the network increases.
The SAS Scoring Accelerator for Greenplum embeds the robustness of SAS Enterprise Miner scoring models directly in the highly scalable Greenplum database. By using the SAS In-Database technology and the SAS Scoring Accelerator for Greenplum, the scoring processing is done inside the database, so the table rows do not have to be transferred to SAS to be scored.
The SAS Scoring Accelerator for Greenplum takes the models that are developed by SAS Enterprise Miner and translates them into scoring functions that can be deployed inside Greenplum. After the scoring functions are published, the functions extend the Greenplum SQL language and can be used in SQL statements like other Greenplum functions.
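The difference between conventional scoring and scoring with a function inside an SQL statement can be sketched as follows. This is only an illustration under stated assumptions: Python's built-in sqlite3 module stands in for Greenplum, and the table customers, the function name score_model, and the scoring formula are all hypothetical and are not produced by the SAS Scoring Accelerator. In SQLite the registered function still runs in the client process, whereas a published Greenplum scoring function is compiled code that runs on the database server, so the sketch shows the shape of the SQL rather than the deployment itself.

```python
import sqlite3


def score_row(age, income):
    """Hypothetical scoring logic standing in for an exported model."""
    return 1 if (0.03 * age + 0.00001 * income) > 1.5 else 0


# In-memory SQLite database used as a stand-in for a Greenplum table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, age REAL, income REAL)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, 25, 30000), (2, 40, 90000), (3, 60, 20000)],
)

# Conventional processing: SELECT * pulls every row to the client,
# and the scoring logic is applied there.
rows = conn.execute("SELECT * FROM customers").fetchall()
client_scores = {row[0]: score_row(row[1], row[2]) for row in rows}

# In-database style: register the scoring logic as an SQL function
# (assumed name: score_model) and call it like any other function.
conn.create_function("score_model", 2, score_row)
db_scores = dict(
    conn.execute("SELECT id, score_model(age, income) FROM customers")
)

# The function can also appear in a WHERE clause, so only the rows of
# interest are returned to the caller.
flagged = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM customers WHERE score_model(age, income) = 1 ORDER BY id"
    )
]

print(client_scores, db_scores, flagged)
```

Both approaches produce the same scores. The second one expresses the scoring as part of the query, which is the pattern that published scoring functions make available in Greenplum SQL.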
The SAS Scoring Accelerator for Greenplum consists of two components:
  • the Score Code Export node in SAS Enterprise Miner. This extension exports the model scoring logic, including metadata about the required input and output variables, from SAS Enterprise Miner.
  • the publishing client that includes the %INDGP_PUBLISH_MODEL macro. This macro translates the scoring model into .c and .h files for creating the scoring functions and generates a script of Greenplum commands for registering the scoring functions. The publishing client then uses the SAS/ACCESS Interface to Greenplum to publish the scoring functions to Greenplum.