Previous Page | Next Page

SAS/ACCESS Interface to Teradata

Maximizing Teradata Load Performance


Overview

To significantly improve performance when loading data, SAS/ACCESS Interface to Teradata provides these facilities. These correspond to native Teradata utilities.

SAS/ACCESS also supports the Teradata Protocol Transporter application programming interface (TPT API), which you can also use with these facilities.


Using FastLoad


FastLoad Supported Features and Restrictions

SAS/ACCESS Interface to Teradata supports a bulk-load capability called FastLoad that greatly accelerates insertion of data into empty Teradata tables. For general information about using FastLoad and error recovery, see the Teradata FastLoad documentation. SAS/ACCESS examples are available.

Note:   Implementation of SAS/ACCESS FastLoad facility will change in a future release of SAS. So you might need to change SAS programming statements and options that you specify to enable this feature in the future.  [cautionend]

The SAS/ACCESS FastLoad facility is similar to the native Teradata FastLoad Utility. They share these limitations:


Starting FastLoad

If you do not specify FastLoad, your Teradata tables are loaded normally (slowly). To start FastLoad in the SAS/ACCESS interface, you can use one of these items:


FastLoad Data Set Options

Here are the data set options that you can use with the FastLoad facility.

To see whether threaded reads are actually generated, turn on SAS tracing by setting OPTIONS SASTRACE=",,,d" in your program.


Using MultiLoad


MultiLoad Supported Features and Restrictions

SAS/ACCESS Interface to Teradata supports a bulk-load capability called MultiLoad that greatly accelerates insertion of data into Teradata tables. For general information about using MultiLoad with Teradata tables and for information about error recovery, see the Teradata MultiLoad documentation. SAS/ACCESS examples are available.

Unlike FastLoad, which only loads empty tables, MultiLoad loads both empty and existing Teradata tables. If you do not specify MultiLoad, your Teradata tables are loaded normally (inserts are sent one row at a time).

The SAS/ACCESS MultiLoad facility loads both empty and existing Teradata tables. SAS/ACCESS supports these features:

Because the SAS/ACCESS MultiLoad facility is similar to the native Teradata MultiLoad utility, they share a limitation in that you must drop the following items on the target tables before the load:

Both the Teradata MultiLoad utility and the SAS/ACCESS MultiLoad facility log data errors to tables. Error recovery can be difficult, but the ability to restart from the last checkpoint is possible. To find the error that corresponds to the code that is stored in the error table, see the Teradata MultiLoad documentation.


MultiLoad Setup

Here are the requirements for using the MultiLoad bulk-load capability in SAS.

If it has not been done so already as part of the post-installation configuration process, see the SAS configuration documentation for your system for information about how to configure SAS to work with MultiLoad.

MultiLoad Data Set Options

Call the SAS/ACCESS MultiLoad facility by specifying MULTILOAD=YES. See the MULTILOAD= data set option for detailed information and examples on loading data and recovering from errors during the load process.

Here are the data set options that are available for use with the MultiLoad facility. For detailed information about these options, see Data Set Options for Relational Databases.

Be aware that these options are disabled while you are using the SAS/ACCESS MultiLoad facility.

To see whether threaded reads are actually generated, turn on SAS tracing by setting OPTIONS SASTRACE=",,,d" in your program.


Using the TPT API


TPT API Supported Features and Restrictions

SAS/ACCESS Interface to Teradata supports the TPT API for loading data. The TPT API provides a consistent interface for Fastload, MultiLoad, and Multi-Statement insert. TPT API documentation refers to Fastload as the load driver, MultiLoad as the update driver, and Multi-Statement insert as the stream driver. SAS supports all three load methods and can restart loading from checkpoints when you use the TPT API with any of them.


TPT API Setup

Here are the requirements for using the TPT API in SAS for loading SAS.

The SAS configuration document for your system contains information about how to configure SAS to work with the TPT API. However, those steps might already have been completed as part of the post-installation configuration process for your site.


TPT API LIBNAME Options

The TPT= LIBNAME option is common to all three supported load methods. If SAS cannot use the TPT API, it reverts to using Fastload, MultiLoad, or Multi-Statement insert, depending on which method of loading was requested without generating any errors.


TPT API Data Set Options

These data set options are common to all three supported load methods:


TPT API FastLoad Supported Features and Restrictions

SAS/ACCESS Interface to Teradata supports the TPT API for FastLoad, also known as the load driver, SAS/ACCESS works by interfacing with the load driver through the TPT API, which in turn uses the Teradata Fastload protocol for loading data. See your Teradata documentation for more information about the load driver.

This is the default FastLoad method. If SAS cannot find the Teradata modules that are required for the TPT API or TPT=NO, then SAS/ACCESS uses the old method of Fastload. SAS/ACCESS can restart Fastload from checkpoints when FastLoad uses the TPT API. The SAS/ACCESS FastLoad facility using the TPT API is similar to the native Teradata FastLoad utility. They share these limitations.


Starting FastLoad with the TPT API

See the SAS configuration document for instructions on setting up the environment so that SAS can find the TPT API modules.

You can use one of these options to start FastLoad in theSAS/ACCESS interface using the TPT API:


FastLoad with TPT API Data Set Options

These data set options are specific to FastLoad using the TPT API:


TPT API MultiLoad Supported Features and Restrictions

SAS/ACCESS Interface to Teradata supports the TPT API for MultiLoad, also known as the update driver. SAS/ACCESS works by interfacing with the update driver through the TPT API. This API then uses the Teradata Multiload protocol for loading data. See your Teradata documentation for more information about the update driver.

This is the default MultiLoad method. If SAS cannot find the Teradata modules that are required for the TPT API or TPT=NO, then SAS/ACCESS uses the old method of MultiLoad. SAS/ACCESS can restart Multiload from checkpoints when MultiLoad uses the TPT API.

The SAS/ACCESS MultiLoad facility loads both empty and existing Teradata tables. SAS/ACCESS supports only insert operations and loading only one target table at time.

The SAS/ACCESS MultLoad facility using the TPT API is similar to the native Teradata MultiLoad utility. A common limitation that they share is that you must drop these items on target tables before the load:

Errors are logged to Teradata tables. Error recovery can be difficult if you do not set TPT_CHECKPOINT_DATA= to enable restart from the last checkpoint. To find the error that corresponds to the code that is stored in the error table, see your Teradata documentation. You can restart a failed job for the last checkpoint by following the instructions in the SAS error log.


Starting MultiLoad with the TPT API

See the SAS configuration document for instructions on setting up the environment so that SAS can find the TPT API modules.

You can use one of these options to start MultiLoad in the SAS/ACCESS interface using the TPT API:


MultiLoad with TPT API Data Set Options

These data set options are specific to MultiLoad using the TPT API:


TPT API Multi-Statement Insert Supported Features and Restrictions

SAS/ACCESS Interface to Teradata supports the TPT API for Multi-Statement insert, also known as the stream driver. SAS/ACCESS works by interfacing with the stream driver through the TPT API, which in turn uses the Teradata Multi-Statement insert (TPump) protocol for loading data. See your Teradata documentation for more information about the stream driver.

This is the default Multi-Statement insert method. If SAS cannot find the Teradata modules that are required for the TPT API or TPT=NO, then SAS/ACCESS uses the old method of Multi-Statement insert. SAS/ACCESS can restart Multi-Statement insert from checkpoints when Multi-Statement insert uses the TPT API.

The SAS/ACCESS Multi-Statement insert facility loads both empty and existing Teradata tables. SAS/ACCESS supports only insert operations and loading only one target table at time.

Errors are logged to Teradata tables. Error recovery can be difficult if you do not set TPT_CHECKPOINT_DATA= to enable restart from the last checkpoint. To find the error that corresponds to the code that is stored in the error table, see your Teradata documentation. You can restart a failed job for the last checkpoint by following the instructions on the SAS error log.


Starting Multi-Statement Insert with the TPT API

See the SAS configuration document for instructions on setting up the environment so that SAS can find the TPT API modules.

You can use one of these options to start Multi-Statement in the SAS/ACCESS interface using the TPT API:


Multi-Statement Insert with TPT API Data Set Options

These data set options are specific to Multi-Statement insert using the TPT API.


Examples

This example starts the FastLoad facility.

libname fload teradata user=testuser password=testpass;
data fload.nffloat(bulkload=yes);
     do x=1 to 1000000;
        output;
     end;
run;

This next example uses FastLoad to append SAS data to an empty Teradata table and specifies the BL_LOG= option to name the error tables Append_Err1 and Append_Err2. In practice, applications typically append many rows.

/* Create the empty Teradata table. */
proc sql;
     connect to teradata as tera(user=testuser password=testpass);
     execute (create table performers 
             (userid int, salary decimal(10,2), job_desc char(50))) 
              by tera;
     execute (commit) by tera;
quit;

/* Create the SAS data to load. */
data local;
     input userid 5. salary 9. job_desc $50.;
           datalines;
             0433 35993.00 grounds keeper
             4432 44339.92 code groomer
             3288 59000.00 manager
             ;

/* Append the SAS data & name the Teradata error tables. */
libname tera teradata user=testuser password=testpass;

proc append data=local base=tera.performers
     (bulkload=yes bl_log=append_err);
run;

This example starts the MultiLoad facility.

libname trlib teradata user=testuser pw=testpass server=dbc;

/* Use MultiLoad to load a table with 2000 rows. */
data trlib.mlfloat(MultiLoad=yes);
     do x=1 to 2000;
        output;
     end;
run;

/* Append another 1000 rows. */
data work.testdata;
  do x=2001 to 3000;
     output;
  end;
run;

/* Append the SAS data to the Teradata table. */
proc append data=work.testdata base=trlib.mlfload
     (MultiLoad=yes);
run;

This example loads data using TPT FastLoad.

/* Check the SAS log for this message to verify that the TPT API was used. 
NOTE:  Teradata connection:  TPT Fastload has inserted 100 rows. 
*/
data trlib.load(TPT=YES FASTLOAD=YES);
        do x=1 to 1000;
                output;
        end;
run;

This example restarts a MultiLoad step that recorded checkpoints and failed after loading 2000 rows of data.

proc append data=trlib.load(TPT=YES MULTILOAD=YES 
    TPT_RESTART=YES TPT_CHECKPOINT_DATA=2000)
data=work.inputdata(FIRSTOBS=2001);
run;

Previous Page | Next Page | Top of Page