Previous Page | Next Page

SAS/ACCESS Interface to DB2 Under z/OS

Bulk Loading for DB2 Under z/OS


Overview

By default, the DB2 under z/OS interface loads data into tables by preparing an SQL INSERT statement, executing the INSERT statement for each row, and issuing a COMMIT statement. You must specify BULKLOAD=YES to start the DB2 LOAD utility. You can then bulk-load rows of data as a single unit, which can significantly enhance performance. For smaller tables, the extra overhead of the bulk-loading process might slow performance. For larger tables, the speed of the bulk-loading process outweighs the overhead costs. DB2 under z/OS bulk-loading examples are available.

When you use bulk load, see the SYSPRINT output for information about the load. If you run the LOAD utility and it fails, ignore the messages in the SAS log because they might be inaccurate. However, if errors existed before you ran the LOAD utility, error messages in the SAS log might be valid.

SAS/ACCESS Interface to DB2 under z/OS provides bulk loading through DSNUTILS, an IBM stored procedure that start the DB2 LOAD utility. DSNUTILS is included in DB2 Version 6 and later, and it is available for DB2 Version 5 in a maintenance release. Because the LOAD utility is complex, familiarize yourself with it before you use it through SAS/ACCESS. Also check with your database administrator to determine whether this utility is available.


Data Set Options for Bulk Loading

Below are the DB2 under z/OSbulk-load data set options. All begin with BL_ for bulk load. To use the bulk-load facility, you must specify BULKLOAD=YES or all bulk-load options are ignored. (The DB2 under z/OS interface alias for BULKLOAD= is DB2LDUTIL=.)


File Allocation and Naming for Bulk Loading

When you use bulk loading, these files (data sets) are allocated.

All allocations of these data sets are reversed by the end of the step. If errors occur before generation of the SysRec, any of these data sets that were allocated as new and cataloged are deleted as part of cleanup because they would be empty.

The interface engine uses these options when it allocates nonexisting SYS data set names.

This table shows how SysIn and SysRec are allocated based on the values of BL_DB2LDEXT= and BL_DB2IN=, and BL_DB2REC=.

SysIn and SysRec Allocation
BL_DB2LDEXT= BL_DB2IN=/ BL_DB2REC= Data set name DISPOSITION
GENRUN not specified generated NEW, CATALOG, DELETE
GENRUN specified specified NEW, CATALOG, DELETE
GENONLY not specified generated NEW, CATALOG, DELETE
GENONLY specified specified NEW, CATALOG, DELETE
USERUN not specified ERROR
USERUN specified specified OLD, KEEP, KEEP

When SAS/ACCESS Interface to DB2 under z/OS uses existing files, you must specify the filenames. When the interface generates the files, it creates them with names you provide or with unique names it generates. Engine-generated filenames use system generated data set names with the format SYSyyddd.Thhmmss.RA000.jobname.name.Hgg where

SYSyyddd

is replaced by the user ID. The user ID used to prequalify these generated data set names is determined the same as within the rest of SAS, except when running in a server environment, where the authenticated ID of the client is used.

name

is replaced by the given SYS ddname of the data set.

For example, if you do not specify any data set names and run GENRUN under TSO, you get a set of files allocated with names such as
USERID.T125547.RA000.USERID.DB2DISC.H01
USERID.T125547.RA000.USERID.DB2ERR.H01
USERID.T125547.RA000.USERID.DB2IN.H01
USERID.T125547.RA000.USERID.DB2MAP.H01
USERID.T125547.RA000.USERID.DB2PRINT.H01
USERID.T125547.RA000.USERID.DB2REC.H01

Because it produces unique names, even within a sysplex (within one second per user ID per system), this naming convention makes it easy to associate all information for each utility execution, and to separate it from other executions.

Bulk-load files are removed at the end of the load process to save space. They are not removed if the utility fails to allow for the load process to be restarted.


Examples

Use these LIBNAME statements for all examples.

libname db2lib db2;
libname shlib db2 connection=shared;

Create a table.

data db2lib.table1 (bulkload=yes);
   x=1;
   name='Tom';
run;

Append Table1 to itself.

data shlib.table1
   (bulkload=yes bl_db2tblxst=yes bl_db2ldct1='RESUME YES');
   set shlib.table1;
run;

Replace Table1 with itself.

data shlib.table1
   (bulkload=yes bl_db2tblxst=yes bd_db2ldct1='REPLACE');
   set shlib.table1;
run;

Load DB2 tables directly from other objects.

data db2lib.emp (bulkload=yes);
   bl_db2ldct1='replace log no nocopypend' bl_db2cursor='select * from dsn8710.emp');
   set db2lib.emp (obs=0);
run;

You can also use this option in a PROC SQL statement to load DB2 tables directly from other objects, as shown below.

options sastrace=',,,d';
libname db2lib db2 authid=dsn8710;
libname mylib db2;

proc delete data mylib.emp;
run;

proc sql;
     connect to db2;
     create table mylib.emp
            (BULKLOAD=YES
             BL_DB2LDCT1='REPLACE LOG NO NOCOPYPEND'
             BL_DB2CURSOR='SELECT FIRSTNAME, LASTNAME, WORKDEPT,
                                  HIREDATE, JOB, SALARY, BONUS, COMM
                           FROM DSN8710.EMP')
     as select firstname, lastname, workdept,
               hiredate, job, salary, bonus, comm
        from db2lib.emp (obs=0);
quit;

Here is another similar example.

options sastrace=',,,d';
libname db2lib db2 authid=dsn8710;
libname mylib db2;

proc delete data mylib.emp;
run;

proc sql;
     connect to db2;
     create table mylib.emp
            (BULKLOAD=YES
             BL_DB2LDCT1='REPLACE LOG NO NOCOPYPEND'
             BL_DB2CURSOR='SELECT FIRSTNAME, LASTNAME, WORKDEPT,
                                  HIREDATE, JOB, SALARY, BONUS, COMM
                           FROM DSN8710.EMP'
             BL_DB2LDCT3='RUNSTATS TABLESPACE DSNDB04.TEMPTTABL
                          TABLE(ALL) INDEX(ALL) REPORT YES')
     as select firstname, lastname, workdept,
               hiredate, job, salary, bonus, comm
        from db2lib.emp (obs=0);
quit;

Generate control and data files, create the table, but do not run the utility to load it.

data shlib.table2 (bulkload=yes 
   bl_db2ldext=genonly bl_db2in='userid.sysin' bl_db2rec='userid.sysrec');
   set shlib.table1;
run;

Use the control and data files that you generated in the preceding example load the table. The OBS=1 data set option on the input file prevents the DATA step from reading the whole file. Because the data is really in SysRec, you need only the input file to satisfy the engine.

data db2lib.table2 (bulkload=yes bl_db2tblxst=yes
     bl_db2ldext=userun bl_db2in='userid.sysin' bl_db2rec='userid.sysrec');
   set db2lib.table1 (obs=1);
run; 

A more efficient approach than the previous example is to eliminate going to DB2 to read even one observation from the input table. This also means that the DATA step processes only one observation, without any input I/O. Note that the one variable V is not on the table. Any variables listed here (there is no need for more than one), are irrelevant because the table already exists; they are not used.

data db2lib.table2 (bulkload=yes bl_db2tblxst=yes
     bl_db2ldext=userun bl_db2in='userid.sysin' bl_db2rec='userid.sysrec');
   v=0;
run;

Generate control and data files, but do not create the table or run the utility. Setting BL_DB2TBLXST=YES when the table does not exist prevents you from creating the table; this only makes sense because you are not going to load any data into the table at this time.

data db2lib.table3 (bulkload=yes bl_db2tblxst=yes
     bl_db2ldext=genonly bl_db2in='userid.sysin' bl_db2rec='userid.sysrec');
   set db2lib.table1;
run;

Use the control and data files that you generated in the preceding example to load the table. The OBS=1 data set option on the input file prevents the DATA step from reading the whole file. In this case, you must specify the input file because it contains the column definitions that are necessary to create the table.

data shlib.table3 (bulkload=yes bl_db2ldext=userun
     bl_db2in='userid.sysin' bl_db2rec='userid.sysrec');
   set shlib.table1 (obs=1);
run;

If you know the column names, a more efficient approach than the previous example is to eliminate going to DB2 to get the column definitions. In this case, the variable names and data types must match, because they are used to create the table. However, the values specified for the variables are not included on the table, because all data to load comes from the existing SysRec.

data db2lib.table3 (bulkload=yes bl_db2ldext=userun
     bl_db2in='userid.sysin' bl_db2rec='userid.sysrec');
   x=0;
   name='???';
run;

You can use other applications that do output processing.

data work.a;
   x=1;
run;

proc sql;
   create db2lib.table4 (bulkload=yes) as select * from a;
quit;

Previous Page | Next Page | Top of Page