SAS/ACCESS Interface to DB2 Under UNIX and PC Hosts |
Overview |
Bulk loading is the fastest way to insert large numbers of rows into a DB2 table. Using this facility instead of regular SQL insert statements, you can insert rows two to ten times more rapidly. DB2 bulk-load examples and tips for maximizing performance are available. You must specify BULKLOAD=YES to use the bulk-load facility.
SAS/ACCESS Interface to DB2 under UNIX and PC Hosts offers LOAD, IMPORT, and CLI LOAD bulk-loading methods. The BL_REMOTE_FILE= and BL_METHOD= data set options determine which method to use.
For more information about the differences between IMPORT, LOAD, and CLI LOAD, see the DB2 Data Movement Utilities Guide and Reference.
To use the LOAD method, you must have system administrator authority, database administrator authority, or load authority on the database and the insert privilege on the table being loaded.
This method also requires that the client and server machines are able to read and write files to a common location, such as a mapped network drive or an NFS directory. To use this method, specify the BL_REMOTE_FILE= option.
Because SAS/ACCESS Interface to DB2 uses the PC/IXF file format to transfer data to the DB2 LOAD utility, you cannot use this method to load data into partitioned databases.
Here are the bulk-load options available with the LOAD method. For details about these options, see Data Set Options for Relational Databases.
BL_LOG=: The log file contains a summary of load information and error descriptions. On most platforms, the default filename is BL_<table>_<unique-ID>.log.
table |
specifies the table name |
unique-ID |
specifies a number used to prevent collisions in the event of two or more simultaneous bulk loads of a particular table. The SAS/ACCESS engine generates the number. |
The IMPORT method does not offer the same level of performance as the LOAD method, but it is available to all users who have insert privileges on the tables being loaded. The IMPORT method does not require that the server and client have a common location in order to access the data file. If you do not specify BL_REMOTE_FILE=, the IMPORT method is automatically used.
Here are the bulk-loading options available with the IMPORT method. For detailed information about these options, see Data Set Options for Relational Databases.
The CLI LOAD method is an interface to the standard DB2 LOAD utility, which gives the added performance of using LOAD but without setting additional options for bulk load. This method requires the same privileges as the LOAD method, and is available only in DB2 Version 7 FixPak 4 and later clients and servers. If your client and server can support the CLI LOAD method, you can generally see the best performance by using it. The CLI LOAD method can also be used to load data into a partitioned DB2 database for client and database nodes that are DB2 Version 8.1 or later. To use this method, specify BL_METHOD=CLILOAD as a data set option. Here are the bulk-load options that are available with the CLI LOAD method:
BL_ALLOW_READ_ACCESS
BL_ALLOW_WRITE_ACCESS
BL_CPU_PARALLELISM
BL_DATA_BUFFER_SIZE
BL_DISK_PARALLELISM
BL_EXCEPTION
Capturing Bulk-Load Statistics into Macro Variables |
These bulk-loading macro variables capture how many rows are loaded, skipped, rejected, committed, and deleted and then writes this information to the SAS log.
SYSBL_ROWSCOMMITTED
SYSBL_ROWSDELETED
SYSBL_ROWSLOADED
SYSBL_ROWSREJECTED
SYSBL_ROWSSKIPPED
Maximizing Load Performance for DB2 Under UNIX and PC Hosts |
These tips can help you optimize LOAD performance when you are using the DB2 bulk-load facility:
Specifying BL_REMOTE_FILE= causes the loader to use the DB2 LOAD utility, which is much faster than the IMPORT utility, but it requires database administrator authority.
Performance might suffer if your setting for DBCOMMIT= is too low. Increase the default (which is 10000 when BULKLOAD=YES) for improved performance.
Increasing the DB2 tuning parameters, such as Utility Heap and I/O characteristics, improves performance. These parameters are controlled by your database or server administrator.
When using the IMPORT utility, specify BL_OPTIONS="COMPOUND=x" where x is a number between 1 and 7 on Windows, and between 1 and 100 on UNIX. This causes the IMPORT utility to insert multiple rows for each execute instead of one row per execute.
When using the LOAD utility on a multi-processor or multi-node DB2 server, specify BL_OPTIONS="ANYORDER" to improve performance. Note that this might cause the entries in the DB2 log to be out of order (because it enables DB2 to insert the rows in an order that is different from how they appear in the loader data file).
Examples |
The following example shows how to use a SAS data set, SASFLT.FLT98, to create and load a large DB2 table, FLIGHTS98. Because the code specifies BULKLOAD=YES and BL_REMOTE_FILE= is omitted, this load uses the DB2 IMPORT command.
libname sasflt 'SAS-data-library'; libname db2_air db2 user=louis using=fromage database='db2_flt' schema=statsdiv; proc sql; create table db2_air.flights98 (bulkload=YES bl_options='compound=7 norowwarnings') as select * from sasflt.flt98; quit;
The BL_OPTIONS= option passes DB2 file type modifiers to DB2. The norowwarnings modifier indicates that all row warnings about rejected rows are to be suppressed.
The following example shows how to append the SAS data set, SASFLT.FLT98 to a preexisting DB2 table, ALLFLIGHTS. Because the code specifies BULKLOAD=YES and BL_REMOTE_FILE=, this load uses the DB2 LOAD command.
proc append base=db2_air.allflights (BULKLOAD=YES BL_REMOTE_FILE='/tmp/tmpflt' BL_LOG='/tmp/fltdata.log' BL_DATAFILE='/nfs/server/tmp/fltdata.ixf' BL_SERVER_DATAFILE='/tmp/fltdata.ixf') data=sasflt.flt98; run;
Here, BL_REMOTE_FILE= and BL_SERVER_DATAFILE= are paths relative to the server. BL_LOG= and BL_DATAFILE= are paths relative to the client.
The following example shows how to use the SAS data set SASFLT.ALLFLIGHTS to create and load a large DB2 table, ALLFLIGHTS. Because the code specifies BULKLOAD=YES and BL_METHOD=CLILOAD, this operation uses the DB2 CLI LOAD interface to the LOAD command.
data db2_air.allflights(BULKLOAD=YES BL_METHOD=CLILOAD); set sasflt.allflights; run;
Copyright © 2010 by SAS Institute Inc., Cary, NC, USA. All rights reserved.