Introduction to SAS In-Database Processing

With conventional processing, SAS asks the SAS/ACCESS engine for all rows of the table being processed. The SAS/ACCESS engine generates an SQL SELECT * statement and passes it to the data source. That SELECT statement fetches every row in the table, and the SAS/ACCESS engine returns the rows to SAS. As the table grows over time, more data must be fetched from the data source to SAS, so the time spent transferring data over the network increases.

SAS in-database processing integrates SAS solutions, SAS analytic processes, and third-party data providers. Using SAS in-database processing, you can run scoring models, some SAS procedures, DS2 thread programs, and formatted SQL queries inside the data source.
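The contrast between conventional and in-database processing can be illustrated with a minimal sketch. This is Python, not SAS code, and it is only an analogy: an in-memory SQLite database stands in for the remote data source, and the `sales` table, its columns, and both function names are hypothetical. The first function mimics the SELECT * pattern and summarizes on the client. The second pushes the summary into the data source so that only the result rows travel back.

```python
import sqlite3

# Hypothetical stand-in for a remote data source. SQLite is used only so
# the sketch runs anywhere; it is not how SAS/ACCESS connects to a database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 10.0), ("west", 20.0), ("east", 5.0), ("west", 1.0), ("north", 7.0)],
)


def conventional_counts(connection):
    """Fetch every row with SELECT * and count rows per region on the client."""
    rows = connection.execute("SELECT * FROM sales").fetchall()
    counts = {}
    for region, _amount in rows:
        counts[region] = counts.get(region, 0) + 1
    # Return the summary plus the number of rows that crossed the connection.
    return counts, len(rows)


def in_database_counts(connection):
    """Let the data source aggregate; only one summary row per region returns."""
    rows = connection.execute(
        "SELECT region, COUNT(*) FROM sales GROUP BY region"
    ).fetchall()
    return dict(rows), len(rows)


conv, conv_rows = conventional_counts(conn)
indb, indb_rows = in_database_counts(conn)

print("conventional rows transferred:", conv_rows)
print("in-database rows transferred:", indb_rows)
print("same result:", conv == indb)
```

Both approaches produce the same frequency counts. With conventional processing, the number of rows transferred grows with the table. With the in-database approach, it grows only with the number of distinct groups.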

Note: For the sake of brevity, in this document, the term database is used to denote both relational database management systems and file systems.

The following table lists the SAS products needed to use these features.

In-Database Feature	Software Required	Supported Data Providers
format publishing and the SAS_PUT( ) function	Base SAS; SAS/ACCESS Interface to the data source	Aster; DB2 under UNIX; Greenplum; Netezza; Teradata
scoring models	Base SAS; SAS/ACCESS Interface to the data source; SAS Scoring Accelerator; SAS Enterprise Miner; SAS Factory Miner (analytic store scoring) [1]; SAS Scalable Performance Data Server (optional); SAS Model Manager (optional)	Aster; DB2 under UNIX; Greenplum; Hadoop; Netezza; Oracle; SAP HANA; SPD Server; Teradata
Base SAS procedures: FREQ; RANK [2]; REPORT; SORT [2]; SUMMARY/MEANS; TABULATE; TRANSPOSE [3]	Base SAS; SAS/ACCESS Interface to the data source; SAS In-Database Code Accelerator (PROC TRANSPOSE only)	Aster; DB2 (UNIX and z/OS); Greenplum; Hadoop; Hawq; Impala; Microsoft SQL Server; Netezza; Oracle; PostgreSQL; Redshift; SAP HANA; Teradata; Vertica
SAS/STAT procedures: CORR; CANCORR; DMDB; DMINE; DMREG; FACTOR; PRINCOMP; REG; SCORE; TIMESERIES; VARCLUS	Base SAS (for CORR); SAS/ACCESS Interface to Teradata; SAS/STAT (for CANCORR, FACTOR, PRINCOMP, REG, SCORE, VARCLUS); SAS/ETS (for TIMESERIES); SAS Enterprise Miner (for DMDB, DMINE, DMREG); SAS Analytics Accelerator	Teradata
DS2 threaded programs	Base SAS; SAS/ACCESS Interface to the data source; SAS In-Database Code Accelerator	Greenplum; Hadoop; Teradata
DATA step scoring programs	Base SAS; SAS/ACCESS Interface to Hadoop	Hadoop
data quality operations	Base SAS; SAS/ACCESS Interface to Hadoop; SAS/ACCESS Interface to Teradata; SAS In-Database Code Accelerator; SAS Data Loader for Hadoop; SAS Data Quality Accelerator for Teradata	Hadoop; Teradata
extract and transform data	Base SAS; SAS/ACCESS Interface to Hadoop; SAS/ACCESS Interface to Teradata; SAS Data Loader for Hadoop; SAS Data Quality Accelerator for Teradata	Hadoop; Teradata

[1] Analytic store scoring is supported only on Hadoop, SAP HANA, and Teradata.
[2] In-database processing of PROC RANK and PROC SORT is supported by Hadoop with Hive 0.13 and later.
[3] In-database processing of PROC TRANSPOSE is supported only on Hadoop and Teradata. Additional licensing and configuration are required. For more information, see Running PROC TRANSPOSE inside the Database.