Bulk Loading for DB2 under UNIX and PC Hosts

Overview

Bulk loading is the fastest way to insert large numbers of rows into a DB2 table. Using this facility instead of regular SQL insert statements, you can insert rows two to ten times more rapidly. DB2 bulk-load examples and tips for maximizing performance are available. You must specify BULKLOAD=YES to use the bulk-load facility.
SAS/ACCESS Interface to DB2 under UNIX and PC Hosts offers LOAD, IMPORT, and CLI LOAD bulk-loading methods. The BL_REMOTE_FILE= and BL_METHOD= data set options determine which method to use.
For more information about the differences between IMPORT, LOAD, and CLI LOAD, see the DB2 Data Movement Utilities Guide and Reference.

Using the LOAD Method

To use the LOAD method, you must have system administrator, database administrator, or load authority on the database and the INSERT privilege on the table to be loaded.
This method also requires that client and server machines can read and write files to a common location such as a mapped network drive or an NFS directory. To use this method, specify the BL_REMOTE_FILE= option.
Note: Because SAS/ACCESS Interface to DB2 uses the PC/IXF file format to transfer data to the DB2 LOAD utility, you cannot use this method to load data into partitioned databases.
Here are the bulk-load options available with the LOAD method. For details about these options, see Overview.

Using the IMPORT Method

The IMPORT method does not offer the same level of performance as the LOAD method. However, it is available to all users with INSERT privileges for the tables to be loaded. The IMPORT method does not require that the server and client have a common location in order to access the data file. If you do not specify BL_REMOTE_FILE=, the IMPORT method is automatically used.
Here are the bulk-loading options available with the IMPORT method. For detailed information about these options, see Overview.

Using the CLI LOAD Method

The CLI LOAD method is an interface to the standard DB2 LOAD utility, which gives the added performance of using LOAD but without setting additional options for bulk load. This method requires the same privileges as the LOAD method, and is available only in DB2 Version 7 FixPak 4 and later clients and servers. If your client and server can support the CLI LOAD method, you can generally see the best performance by using it. The CLI LOAD method can also be used to load data into a partitioned DB2 database for client and database nodes that are DB2 Version 8.1 or later. To use this method, specify BL_METHOD=CLILOAD as a data set option. Here are the bulk-load options that are available with the CLI LOAD method:

Capturing Bulk-Load Statistics into Macro Variables

These bulk-loading macro variables capture how many rows are loaded, skipped, rejected, committed, and deleted before writing this information to the SAS log.
  • SYSBL_ROWSCOMMITTED
  • SYSBL_ROWSDELETED
  • SYSBL_ROWSLOADED
  • SYSBL_ROWSREJECTED
  • SYSBL_ROWSSKIPPED

Maximizing Load Performance for DB2 under UNIX and PC Hosts

These tips can help you optimize LOAD performance when you are using the DB2 bulk-load facility.
  • Specifying BL_REMOTE_FILE= causes the loader to use the DB2 LOAD utility, which is much faster than the IMPORT utility, but it requires database administrator authority.
  • Performance might suffer if your setting for DBCOMMIT=is too low. Increase the default (which is 10000 when BULKLOAD=YES) for improved performance.
  • Increasing the DB2 tuning parameters, such as Utility Heap and input-output characteristics, improves performance. These parameters are controlled by your database or server administrator.
  • When using the IMPORT utility, specify BL_OPTIONS="COMPOUND=x"—where x is a number between 1 and 7 on Windows, and between 1 and 100 on UNIX. This causes the IMPORT utility to insert multiple rows for each execute instead of one row per execute.
  • When using the LOAD utility on a multi-processor or multi-node DB2 server, specify BL_OPTIONS="ANYORDER" to improve performance. This might cause DB2 log entries to be out of order because it lets DB2 insert rows in a different order from how they appear in the loader data file.

Examples

The following example shows how to use a SAS data set, SASFLT.FLT98, to create and load a large DB2 table, FLIGHTS98. Because the code specifies BULKLOAD=YES and BL_REMOTE_FILE= is omitted, this load uses the DB2 IMPORT command.
libname sasflt 'SAS-library';
libname db2_air db2 user=louis using=fromage
        database='db2_flt' schema=statsdiv;

proc sql;
create table db2_air.flights98
       (bulkload=YES bl_options='compound=7 norowwarnings')
        as select * from sasflt.flt98;
quit;
The BL_OPTIONS= option passes DB2 file type modifiers to DB2. The norowwarnings modifier indicates that all row warnings about rejected rows are to be suppressed.
The following example shows how to append the SAS data set, SASFLT.FLT98 to a pre-existing DB2 table, ALLFLIGHTS. Because the code specifies BULKLOAD=YES and BL_REMOTE_FILE=, this load uses the DB2 LOAD command.
proc append base=db2_air.allflights
 (BULKLOAD=YES
  BL_REMOTE_FILE='/tmp/tmpflt'
  BL_LOG='/tmp/fltdata.log'
  BL_DATAFILE='/nfs/server/tmp/fltdata.ixf'
  BL_SERVER_DATAFILE='/tmp/fltdata.ixf')
data=sasflt.flt98;
run;
Here, BL_REMOTE_FILE= and BL_SERVER_DATAFILE= are paths relative to the server. BL_LOG= and BL_DATAFILE= are paths relative to the client.
The following example shows how to use the SAS data set SASFLT.ALLFLIGHTS to create and load a large DB2 table, ALLFLIGHTS. Because the code specifies BULKLOAD=YES and BL_METHOD=CLILOAD, this operation uses the DB2 CLI LOAD interface to the LOAD command.
data db2_air.allflights(BULKLOAD=YES BL_METHOD=CLILOAD);
set sasflt.allflights;
run;