Maximizing Teradata Read Performance

Overview

A major objective of SAS/ACCESS when you are reading DBMS tables is to take advantage of the Teradata rate of data transfer. The DBINDEX=, SPOOL=SPOOL= LIBNAME Option, and PREFETCH= options can help you achieve optimal read performance. This section provides detailed information about PREFETCH as a LIBNAME option and PREFETCH as a global option.

Using the PreFetch Facility

Overview

PreFetch is a SAS/ACCESS Interface to Teradata facility that speeds up a SAS job by exploiting the parallel processing capability of Teradata. To obtain benefit from the facility, your SAS job must run more than once and have these characteristics:
  • use SAS/ACCESS to query Teradata DBMS tables
  • should not contain SAS statements that create, update, or delete Teradata DBMS tables
  • run SAS code that changes infrequently or not at all.
In brief, the ideal job is a stable read-only SAS job.
Use of PreFetch is optional. To use the facility, you must explicitly enable it with the PREFETCH= LIBNAME option.

How PreFetch Works

When reading DBMS tables,SAS/ACCESS submits SQL statements on your behalf to Teradata. Each SQL statement that is submitted has an execution cost: the amount of time Teradata spends processing the statement before it returns the requested data to SAS/ACCESS.
When PreFetch is enabled the first time you run your SAS job, SAS/ACCESS identifies and selects statements with a high execution cost. SAS/ACCESS then stores (caches) the selected SQL statements to one or more Teradata macros that it creates.
On subsequent runs of the job, when PreFetch is enabled, SAS/ACCESS extracts statements from the cache and submits them to Teradata in advance. The rows that these SQL statements select are immediately available to SAS/ACCESS because Teradata prefetches them. Your SAS job runs faster because PreFetch reduces the wait for SQL statements with a high execution cost. However, PreFetch improves elapsed time only on subsequent runs of a SAS job. During the first run, SAS/ACCESS only creates the SQL cache and stores selected SQL statements; no prefetching is performed.

PreFetch Option Arguments

unique_storename
As mentioned, when PreFetch is enabled, SAS/ACCESS creates one or more Teradata macros to store the selected SQL statements that PreFetch caches. You can easily distinguish a PreFetch macro from other Teradata macros. The PreFetch Teradata macro contains a comment that is prefaced with this text:
"SAS/ACCESS PreFetch Cache"
The name that the PreFetch facility assigns for the macro is the value that you enter for the unique_storename argument. The unique_storename must be unique. Do not specify a name that exists in the Teradata DBMS already for a DBMS table, view, or macro. Also, do not enter a name that exists already in another SAS job that uses the Prefetch facility.
#sessions
This argument specifies how many cached SQL statements SAS/ACCESS submits in parallel to Teradata. In general, your SAS job completes faster if you increase the number of statements that Teradata works on in advance. However, a large number (too many sessions) can strain client and server resources. A valid value is 1 through 9. If you do not specify a value for this argument, the default is 3.
In addition to the specified number of sessions, SAS/ACCESS adds an additional session for submitting SQL statements that are not stored in the PreFetch cache. Therefore, if the default is 3, SAS/ACCESS actually opens up to four sessions on the Teradata server.
algorithm
This argument is present to handle future enhancements. Currently PreFetch only supports one algorithm, SEQUENTIAL.

When and Why Use PreFetch

If you have a read-only SAS job that runs frequently, this is an ideal candidate for PreFetch. An example is a daily job that extracts data from Teradata tables. To help you decide when to use PreFetch, consider these daily jobs:
  • Job 1
    Reads and collects data from the Teradata DBMS.
  • Job 2
    Contains a WHERE clause that reads in values from an external, variable data source. As a result, the SQL code that the job submits through a Teradata LIBNAME statement or through PROC SQL changes from run to run.
In these examples, Job 1 is an excellent candidate for the facility. In contrast, Job 2 is not. Using PreFetch with Job 2 does not return incorrect results, but can impose a performance penalty. PreFetch uses stored SQL statements. Therefore, Job 2 is not a good candidate because the SQL statements that the job generates with the WHERE clause change each time the job is run. Consequently, the SQL statements that the job generates never match the statements that are stored.
The impact of Prefetch on processing performance varies by SAS job. Some jobs improve elapsed time 5% or less; others improve elapsed time 25% or more.

Possible Unexpected Results

It is unlikely, but possible, to write a SAS job that delivers unexpected or incorrect results. This can occur if the job contains code that waits on some Teradata or system event before proceeding. For example, SAS code that pauses the SAS job until another user updates a given data item in a Teradata table. Another examples is SAS code that pauses the SAS job until a given time, such as 5:00 p.m. In both cases, PreFetch would generate SQL statements in advance. But, table results from these SQL statements would not reflect data changes that are made by the scheduled Teradata or system event.

PreFetch Processing of Unusual Conditions

PreFetch is designed to handle unusual conditions gracefully. Here are some of the unusual conditions that are included:
Condition: Your job contains SAS code that creates updates, or deletes Teradata tables.
PreFetch is designed only for Read operations and is disabled when it encounters a nonread operation. The facility returns a performance benefit up to the point where the first nonread operation is encountered. After that, SAS/ACCESS disables the PreFetch facility and continues processing.
Condition: Your SQL cache name (unique_storename value) is identical to the name of a Teradata table.
PreFetch issues a warning message. SAS/ACCESS disables the PreFetch facility and continues processing.
Condition: You change your SAS code for a job that has PreFetch enabled.
PreFetch detects that the SQL statements for the job changed and deletes the cache. SAS/ACCESS disables Prefetch and continues processing. The next time you run the job, PreFetch creates a fresh cache.
Condition: Your SAS job encounters a PreFetch cache that was created by a different SAS job.
PreFetch deletes the cache. SAS/ACCESS disables Prefetch and continues processing. The next time you run the job, PreFetch creates a fresh cache.
Condition: You remove the PreFetch option from an existing job.
Prefetch is disabled. Even if the SQL cache (Teradata macro) still exists in your database, SAS/ACCESS ignores it.
Condition: You accidentally delete the SQL cache (the Teradata macro created by PreFetch) for a SAS job that has enabled PreFetch.
SAS/ACCESS simply rebuilds the cache on the next run of the job. In subsequent job runs, PreFetch continues to enhance performance.

Using PreFetch as a LIBNAME Option

If you specify the PREFETCH= option in a LIBNAME statement, PreFetch applies the option to tables read by the libref.
If you have more than one LIBNAME in your SAS job, and you specify PREFETCH= for each LIBNAME, remember to make the SQL cache name for each LIBNAME unique.
This example applies PREFETCH= to one of two librefs. During the first job run, PreFetch stores SQL statements for tables referenced by the libref ONE in a Teradata macro named PF_STORE1 for reuse later.
libname one teradata user=testuser password=testpass
 prefetch='pf_store1';
libname two teradata user=larry password=riley;
This example applies PREFETCH= to multiple librefs. During the first job run, PreFetch stores SQL statements for tables that the EMP libref referenced to a Teradata macro named EMP_SAS_MACRO and SQL statements for tables that the SALE libref referenced to a Teradata macro named SALE_SAS_MACRO.
libname emp teradata user=testuser password=testpass
 prefetch='emp_sas_macro';
libname sale teradata user=larry password=riley
 prefetch='sale_sas_macro';  

Using Prefetch as a Global Option

Unlike other Teradata LIBNAME options, you can also invoke PreFetch globally for a SAS job. To do this, place the OPTION DEBUG= statement in your SAS program before all LIBNAME statements and PROC SQL steps. If your job contains multiple LIBNAME statements, the global PreFetch invocation creates a uniquely named SQL cache name for each of the librefs.
Do not be confused by the DEBUG= option here. It is merely a mechanism to deliver the PreFetch capability globally. PreFetch is not for debugging; it is a supported feature of SAS/ACCESS Interface to Teradata.
In this example the first time you run the job with PreFetch enabled, the facility creates three Teradata macros: UNIQUE_MAC1, UNIQUE_MAC2, and UNIQUE_MAC3. In subsequent runs of the job, PreFetch extracts SQL statements from these Teradata macros, enhancing the job performance across all three librefs referenced by the job.
option debug="PREFETCH(unique_mac,2,SEQUENTIAL)";
libname one teradata user=kamdar password=ellis;
libname two teradata user=kamdar password=ellis
                          database=larry;
libname three teradata user=kamdar password=ellis
                            database=wayne;
proc print data=one.kamdar_goods;
run;
proc print data=two.larry_services;
run;
proc print data=three.wayne_miscellaneous;
run; 
In this example PreFetch selects the algorithm, that is, the order of the SQL statements. (The OPTION DEBUG= statement must be the first statement in your SAS job.)
option debug='prefetch(pf_unique_sas,3)';
In this example the user specifies for PreFetch to use the SEQUENTIAL algorithm. (The OPTION DEBUG= statement must be the first statement in your SAS job.)
option debug='prefetch(sas_pf_store,3,sequential)';