SAS/ACCESS Interface to DB2 Under z/OS |
Overview |
By default, the DB2 under z/OS interface loads data into tables by preparing an SQL INSERT statement, executing the INSERT statement for each row, and issuing a COMMIT statement. You must specify BULKLOAD=YES to start the DB2 LOAD utility. You can then bulk-load rows of data as a single unit, which can significantly enhance performance. For smaller tables, the extra overhead of the bulk-loading process might slow performance. For larger tables, the speed of the bulk-loading process outweighs the overhead costs. DB2 under z/OS bulk-loading examples are available.
When you use bulk load, see the SYSPRINT output for information about the load. If you run the LOAD utility and it fails, ignore the messages in the SAS log because they might be inaccurate. However, if errors existed before you ran the LOAD utility, error messages in the SAS log might be valid.
SAS/ACCESS Interface to DB2 under z/OS provides bulk loading through DSNUTILS, an IBM stored procedure that start the DB2 LOAD utility. DSNUTILS is included in DB2 Version 6 and later, and it is available for DB2 Version 5 in a maintenance release. Because the LOAD utility is complex, familiarize yourself with it before you use it through SAS/ACCESS. Also check with your database administrator to determine whether this utility is available.
Data Set Options for Bulk Loading |
Below are the DB2 under z/OSbulk-load data set options. All begin with BL_ for bulk load. To use the bulk-load facility, you must specify BULKLOAD=YES or all bulk-load options are ignored. (The DB2 under z/OS interface alias for BULKLOAD= is DB2LDUTIL=.)
BL_DB2CURSOR=
BL_DB2DATACLAS=
BL_DB2DEVT_PERM=
BL_DB2DEVT_TEMP=
BL_DB2DISC=
BL_DB2ERR=
BL_DB2IN=
BL_DB2LDCT1=
BL_DB2LDCT2=
BL_DB2LDCT3=
BL_DB2LDEXT=
BL_DB2MGMTCLAS=
BL_DB2MAP=
BL_DB2PRINT=
BL_DB2PRNLOG=
BL_DB2REC=
BL_DB2RECSP=
BL_DB2RSTRT=
BL_DB2SPC_PERM=
BL_DB2SPC_TEMP=
BL_DB2STORCLAS=
BL_DB2TBLXST=
BL_DB2UNITCOUNT=
BL_DB2UTID=
File Allocation and Naming for Bulk Loading |
When you use bulk loading, these files (data sets) are allocated.
The DB2 DSNUTILS procedure allocates these as new and catalogs the SysDisc, SysMap, and SysErr file unless BL_DB2LDEXT=USERUN (in which case the data sets are allocated as old and are kept).
The DB2 interface engine allocates as new and catalogs the files SysIn and SysRec when the execution method specifies to generate them.
The DB2 interface engine allocates as new and catalogs the file SysPrint when the execution method specifies to run the utility.
All allocations of these data sets are reversed by the end of the step. If errors occur before generation of the SysRec, any of these data sets that were allocated as new and cataloged are deleted as part of cleanup because they would be empty.
The interface engine uses these options when it allocates nonexisting SYS data set names.
DSNUTILS uses BL_DB2DEVT_PERM= and BL_DB2SPC_PERM= for SysDisc, SysMap, and SysErr.
The DB2 interface engine uses BL_DB2DEVT_PERM= for SysIn, SysRec, and SysPrint.
SysRec uses BL_DB2RECSPC=. BL_DB2RECSPC= is necessary because the engine cannot determine how much space the SysRec requires--it depends on the volume of data being loaded into the table.
DSNUTILs uses BL_DB2DEVT_TEMP= and BL_DB2SPC_TEMP= to allocate the other data set names that the LOAD utility requires.
This table shows how SysIn and SysRec are allocated based on the values of BL_DB2LDEXT= and BL_DB2IN=, and BL_DB2REC=.
BL_DB2LDEXT= | BL_DB2IN=/ BL_DB2REC= | Data set name | DISPOSITION |
---|---|---|---|
GENRUN | not specified | generated | NEW, CATALOG, DELETE |
GENRUN | specified | specified | NEW, CATALOG, DELETE |
GENONLY | not specified | generated | NEW, CATALOG, DELETE |
GENONLY | specified | specified | NEW, CATALOG, DELETE |
USERUN | not specified | ERROR |
|
USERUN | specified | specified | OLD, KEEP, KEEP |
When SAS/ACCESS Interface to DB2 under z/OS uses existing files, you must specify the filenames. When the interface generates the files, it creates them with names you provide or with unique names it generates. Engine-generated filenames use system generated data set names with the format SYSyyddd.Thhmmss.RA000.jobname.name.Hgg where
is replaced by the user ID. The user ID used to prequalify these generated data set names is determined the same as within the rest of SAS, except when running in a server environment, where the authenticated ID of the client is used.
USERID.T125547.RA000.USERID.DB2DISC.H01 USERID.T125547.RA000.USERID.DB2ERR.H01 USERID.T125547.RA000.USERID.DB2IN.H01 USERID.T125547.RA000.USERID.DB2MAP.H01 USERID.T125547.RA000.USERID.DB2PRINT.H01 USERID.T125547.RA000.USERID.DB2REC.H01
Because it produces unique names, even within a sysplex (within one second per user ID per system), this naming convention makes it easy to associate all information for each utility execution, and to separate it from other executions.
Bulk-load files are removed at the end of the load process to save space. They are not removed if the utility fails to allow for the load process to be restarted.
Examples |
Use these LIBNAME statements for all examples.
libname db2lib db2; libname shlib db2 connection=shared;
data db2lib.table1 (bulkload=yes); x=1; name='Tom'; run;
data shlib.table1 (bulkload=yes bl_db2tblxst=yes bl_db2ldct1='RESUME YES'); set shlib.table1; run;
data shlib.table1 (bulkload=yes bl_db2tblxst=yes bd_db2ldct1='REPLACE'); set shlib.table1; run;
Load DB2 tables directly from other objects.
data db2lib.emp (bulkload=yes); bl_db2ldct1='replace log no nocopypend' bl_db2cursor='select * from dsn8710.emp'); set db2lib.emp (obs=0); run;
You can also use this option in a PROC SQL statement to load DB2 tables directly from other objects, as shown below.
options sastrace=',,,d'; libname db2lib db2 authid=dsn8710; libname mylib db2; proc delete data mylib.emp; run; proc sql; connect to db2; create table mylib.emp (BULKLOAD=YES BL_DB2LDCT1='REPLACE LOG NO NOCOPYPEND' BL_DB2CURSOR='SELECT FIRSTNAME, LASTNAME, WORKDEPT, HIREDATE, JOB, SALARY, BONUS, COMM FROM DSN8710.EMP') as select firstname, lastname, workdept, hiredate, job, salary, bonus, comm from db2lib.emp (obs=0); quit;
Here is another similar example.
options sastrace=',,,d'; libname db2lib db2 authid=dsn8710; libname mylib db2; proc delete data mylib.emp; run; proc sql; connect to db2; create table mylib.emp (BULKLOAD=YES BL_DB2LDCT1='REPLACE LOG NO NOCOPYPEND' BL_DB2CURSOR='SELECT FIRSTNAME, LASTNAME, WORKDEPT, HIREDATE, JOB, SALARY, BONUS, COMM FROM DSN8710.EMP' BL_DB2LDCT3='RUNSTATS TABLESPACE DSNDB04.TEMPTTABL TABLE(ALL) INDEX(ALL) REPORT YES') as select firstname, lastname, workdept, hiredate, job, salary, bonus, comm from db2lib.emp (obs=0); quit;
Generate control and data files, create the table, but do not run the utility to load it.
data shlib.table2 (bulkload=yes bl_db2ldext=genonly bl_db2in='userid.sysin' bl_db2rec='userid.sysrec'); set shlib.table1; run;
Use the control and data files that you generated in the preceding example load the table. The OBS=1 data set option on the input file prevents the DATA step from reading the whole file. Because the data is really in SysRec, you need only the input file to satisfy the engine.
data db2lib.table2 (bulkload=yes bl_db2tblxst=yes bl_db2ldext=userun bl_db2in='userid.sysin' bl_db2rec='userid.sysrec'); set db2lib.table1 (obs=1); run;
A more efficient approach than the previous example is to eliminate going to DB2 to read even one observation from the input table. This also means that the DATA step processes only one observation, without any input I/O. Note that the one variable V is not on the table. Any variables listed here (there is no need for more than one), are irrelevant because the table already exists; they are not used.
data db2lib.table2 (bulkload=yes bl_db2tblxst=yes bl_db2ldext=userun bl_db2in='userid.sysin' bl_db2rec='userid.sysrec'); v=0; run;
Generate control and data files, but do not create the table or run the utility. Setting BL_DB2TBLXST=YES when the table does not exist prevents you from creating the table; this only makes sense because you are not going to load any data into the table at this time.
data db2lib.table3 (bulkload=yes bl_db2tblxst=yes bl_db2ldext=genonly bl_db2in='userid.sysin' bl_db2rec='userid.sysrec'); set db2lib.table1; run;
Use the control and data files that you generated in the preceding example to load the table. The OBS=1 data set option on the input file prevents the DATA step from reading the whole file. In this case, you must specify the input file because it contains the column definitions that are necessary to create the table.
data shlib.table3 (bulkload=yes bl_db2ldext=userun bl_db2in='userid.sysin' bl_db2rec='userid.sysrec'); set shlib.table1 (obs=1); run;
If you know the column names, a more efficient approach than the previous example is to eliminate going to DB2 to get the column definitions. In this case, the variable names and data types must match, because they are used to create the table. However, the values specified for the variables are not included on the table, because all data to load comes from the existing SysRec.
data db2lib.table3 (bulkload=yes bl_db2ldext=userun bl_db2in='userid.sysin' bl_db2rec='userid.sysrec'); x=0; name='???'; run;
You can use other applications that do output processing.
data work.a; x=1; run; proc sql; create db2lib.table4 (bulkload=yes) as select * from a; quit;
Copyright © 2010 by SAS Institute Inc., Cary, NC, USA. All rights reserved.