Selecting and Combining IMS Data

Methods to Selecting and Combining IMS Data

A great majority of SAS programs select and combine data from various sources. The method that you use depends on the configuration of the data. The next three examples show you how to select and combine data using two different methods: the WHERE statement used in a DATA step and the SQL procedure. When choosing between these methods, you should first read the performance considerations discussed in Advanced User Topics for the SAS/ACCESS Interface View Engine for IMS.

Selecting and Combining Data Using the WHERE Statement

Suppose you have two view descriptors, Vlib.ChkCrd and Vlib.ChkDeb, that contain information about the checking accounts of customers. The view descriptor Vlib.ChkCrd describes the checking credit data in the CUSTOMER, CHCKACCT, and CHCKCRDT segments, and the view descriptor Vlib.ChkDeb describes the checking debit data in the CUSTOMER, CHCKACCT, and CHCKDEBT segments. You could use the SET statement to concatenate the data in these files and create a SAS data file that contains information about checking account transactions by customer. Since you are accessing the same database more than once, you need to reference the same PSB in both view descriptors, but use different PCB index values, where each value references an AcctDBD PCB that is sensitive to the segments defined in the view. In this example, Vlib.ChkCrd uses a PCB index value of 2, and Vlib.ChkDeb uses a PCB index value of 3 in the ACCUPSB PSB.
The PROC SORT statement orders the accounts by Social Security number and checking account number.
data chktrans (keep=soc_sec_number 
  check_account_number trantype date amount);
   length trantype $ 6;
   format date date9. amount dollar12.2;
   set vlib.chkcrd(in=crd) vlib.chkdeb(in=dbt);
   where check_balance>0;
   if crd then do;
      trantype='Credit';
      date=check_credit_date;
      amount=check_credit_amount;
   end;
   else if dbt then do;
      trantype='Debit';
      date=check_debit_date;
      amount=check_debit_amount;
   end
run;
proc sort;
   by soc_sec_number check_account_number;
run;

options nodate linesize=80;

proc print data=chktrans;
   by soc_sec_number;
   var check_account_number trantype date amount;
   title2 'Checking Account Transactions by SSN';
   run;
In the SAS WHERE statement, be sure to use the IMS item name as the search criteria when VALIDVARNAME=V7 and the SAS variable name when VALIDVARNAME=V6. The following output shows the result of the new temporary SAS data file Work.ChkTrans.
Results of Selecting and Combining Data Using the WHERE Statement
                      Checking Account Transactions by SSN                  
                                                                            
   ----------------------  SOC_SEC_NUMBER=156-45-5672  --------------------
                                                                            
                  CHECK_ACCOUNT_
          OBS        NUMBER       TRANTYPE         DATE          AMOUNT     
                                                                            
           1      345620123456     Credit     01APR1991         $100.00     
           2      345620123456     Debit      28MAR1991          $13.29     
           3      345620123456     Debit      31MAR1991          $32.87     
           4      345620123456     Debit      02APR1991          $50.00     
           5      345620123456     Debit      31MAR1991          $13.42     
                                                                            
                                                                            
   ----------------------  SOC_SEC_NUMBER=178-42-6534  --------------------
  
                  CHECK_ACCOUNT_
          OBS        NUMBER       TRANTYPE         DATE          AMOUNT     

           6      745920057114     Credit     12JUN1991       $1,300.00     
           7      745920057114     Debit      10JUN1991          $25.89     
                                                                            
                                                                          
   ----------------------  SOC_SEC_NUMBER=234-74-4612  --------------------
                                                                            
                  CHECK_ACCOUNT_
          OBS        NUMBER       TRANTYPE         DATE          AMOUNT     
                                                                            
           8      345620113263     Credit     31MAR1991         $672.32     
           9      345620113263     Debit              .             .       
                                                                            
                                                                            
    ----------------------  SOC_SEC_NUMBER=434-62-1224  --------------------
                                                                            
                  CHECK_ACCOUNT_
          OBS        NUMBER       TRANTYPE         DATE          AMOUNT      
                                                                            
          10      345620134564     Credit     22MAR1991       $1,342.42     
          11      345620134564     Debit      18MAR1991         $432.87     
          12      345620134564     Debit      18MAR1991          $19.23     
          13      345620134564     Debit      22MAR1991         $723.23     
          14      345620134564     Debit      22MAR1991          $82.32     
          15      345620134564     Debit      26MAR1991          $73.62   
          16      345620134564     Debit      26MAR1991          $31.23     
          17      345620134564     Debit      29MAR1990         $162.87     
          18      345620134564     Debit      29MAR1991           $7.12     
          19      345620134564     Debit      31MAR1991          $62.34     
          20      345620134663     Credit     28MAR1991         $120.00     
          21      345620134663     Debit      28MAR1991          $25.00     
                                                                            
                                                                            
   ----------------------  SOC_SEC_NUMBER=434-62-1234  --------------------
                                                                            
                  CHECK_ACCOUNT_
          OBS        NUMBER       TRANTYPE         DATE          AMOUNT     
                                                                            
          22      345620104732     Credit     02APR1991         $400.00     
          23      345620104732     Debit              .             .       
                                                                            
                                                                            
  
                               
 
   ----------------------  SOC_SEC_NUMBER=436-42-6394  --------------------
                                                                            
                  CHECK_ACCOUNT_
          OBS        NUMBER       TRANTYPE         DATE          AMOUNT     
                                                                            
          24      345620135872     Credit     02APR1991          $50.00     
          25      345620135872     Debit      30MAR1990             .       
                                                                            
                                                                            
   ----------------------  SOC_SEC_NUMBER=456-45-3462  --------------------
                                                                            
                  CHECK_ACCOUNT_
          OBS        NUMBER       TRANTYPE         DATE          AMOUNT     
                                                                             
          26      345620134522     Credit     05APR1991          $50.00     
          27      345620134522     Debit      29MAR1991          $42.73     
          28      345620134522     Debit      29MAR1991         $172.45     
          29      345620134522     Debit      30MAR1991          $38.23  
          30      345620134522     Debit      02APR1991          $10.00     
                                                                            
                                                                            
   ----------------------  SOC_SEC_NUMBER=657-34-3245  --------------------
                                                                            
                  CHECK_ACCOUNT_
          OBS        NUMBER       TRANTYPE         DATE          AMOUNT     
                                                                             
          31      345620131455     Credit     04APR1991         $230.00     
          32      345620131455     Debit              .             .       
                                                                            
                                                                            
   ----------------------  SOC_SEC_NUMBER=667-73-8275  --------------------
                                                                            
                  CHECK_ACCOUNT_
          OBS        NUMBER       TRANTYPE         DATE          AMOUNT     
                                                                            
          33      345620145345     Credit     31MAR1991       $1,563.23     
          34      345620145345     Debit      19MAR1990             .       
          35      345620145345     Debit      23MAR1991         $820.00     
          36      345620145345     Debit      23MAR1991          $52.00     
          37      345620145345     Debit      28MAR1991         $193.00     
          38      345620154633     Credit     31MAR1991       $1,563.23     
          39      345620154633     Debit              .             .       
                                                                            
                                                                            
   ----------------------  SOC_SEC_NUMBER=667-82-8275  --------------------
                                                                            
                CHECK_ACCOUNT_
          OBS      NUMBER         TRANTYPE         DATE          AMOUNT     
                                                                            
          40      382957492811     Credit     16APR1991         $100.00     
          41      382957492811     Debit              .             .       
                                                                              
The first line of the DATA step uses the KEEP= data set option. This option works with view descriptors just as it works with other SAS data sets. The KEEP= option specifies that you want only the listed variables included in the new SAS data file Work.ChkTrans, although you can use the other variables in the view descriptor within the DATA step. Note that the KEEP= option does not reduce the number of variables mapped by the view descriptor and therefore does not reduce the amount of data read by the engine.
When you reference a view descriptor in a SAS procedure or DATA step, it is more efficient to use a SAS WHERE statement than a subsetting IF statement because an IF statement does not reduce the amount of data read. A DATA step or SAS procedure passes the SAS WHERE statement to the interface view engine, which attempts to create SSAs from the WHERE statement. If the engine can create the SSAs, it processes the SAS WHERE statement and returns to SAS only the data that satisfies the WHERE statement. Otherwise, all the data referenced by the view descriptor is returned to SAS for processing. Processing IMS data using a WHERE statement that the IMS engine can turn into SSAs reduces the amount of data read and retrieved by the engine. This improves engine performance significantly. For more information about how IMS handles WHERE statements, see Performance and Efficient View Descriptors.
For more information about the SAS WHERE statement, see SAS Statements: Reference.

Selecting and Combining Data Using the SAS SQL Procedure

This section provides two examples of using the SAS SQL procedure on IMS data. The SQL procedure implements the Structured Query Language (SQL) in SAS 7 and later. The SQL procedure is a good way to perform SQL operations with IMS, which by itself has no SQL capabilities. The first example illustrates how to use PROC SQL to combine data from three sources. The second example shows how to use the GROUP BY clause to create new items from data that is described by a view descriptor.

Combining Data from Various Sources

Suppose you have the following items:
  • a view descriptor, Vlib.CustAcct, that is based on the CUSTOMER and CHCKACCT segments of the IMS database AcctDBD.
  • a SAS data file, MyData.ChgData, which contains checking account numbers and checking fees.
  • a view descriptor, MyData.BankChrg, that is based on data in a DB2 table that contains additional banking fees. (The MyData.BankChrg view descriptor has been created using the SAS/ACCESS interface to DB2.)
You can use PROC SQL to create a view that joins all these sources of data. When you use the PROC SQL view in your SAS program, the joined data is presented in a single output table. In this example, using the SAS WHERE or subsetting IF statements would not be an appropriate way of presenting data from various sources because you want to compare variables from several sources rather than simply merge or concatenate the data. For more information about the DB2 table that is used in this example, see Example Data.
CAUTION:
When you use PROC SQL to access more than one IMS database, the view descriptors for each database must use the same PSB.
In addition, a PCB must be included in that PSB for each database that you want to access. If you are accessing the same database multiple times, each view descriptor must specify a different PCB using the PCB index field.
The following code prints the view descriptors and the SAS data file:
options nodate linesize=120;

proc print data=vlib.custacct;
   title2 'Data Described by VLIB.CUSTACCT';
run;

options nodate linesize=80;

proc print data=mydata.bankchrg;
   title2 'Data Described by MYDATA.BANKCHRG';
run;
proc print data=mydata.chgdata;
   title2 'SAS Data File MYDATA.CHGDATA';
run;
The following three outputs show the results of the PRINT procedure performed on the Vlib.CustAcct view descriptor (based on IMS data), the MyData.BankChrg view descriptor (based on DB2 data), and the MyData.ChgData data file.
Data That Is Described by Vlib.CustAcct
                               The SAS System                     
                          Data Described by VLIB.CUSTACCT

             SOC_SEC_                                      CHECK_ACCOUNT_
      OBS    NUMBER         CUSTOMER_NAME                  NUMBER

      1    667-73-8275    WALLS, HOOPER J.                 345620145345
      2    667-73-8275    WALLS, HOOPER J.                 345620154633
      3    434-62-1234    SUMMERS, MARY T.                 345620104732
      4    436-42-6394    BOOKER, APRIL M.                 345620135872
      5    434-62-1224    SMITH, JAMES MARTIN              345620134564
      6    434-62-1224    SMITH, JAMES MARTIN              345620134663
      7    178-42-6534    PATTILLO, RODRIGUES              745920057114
      8    156-45-5672    O'CONNOR, JOSEPH                 345620123456
      9    657-34-3245    BARNHARDT, PAMELA S.             345620131455
     10    667-82-8275    COHEN, ABRAHAM                   382957492811
     11    456-45-3462    LITTLE, NANCY M.                 345620134522
     12    234-74-4612    WIKOWSKI, JONATHAN S.           
345620113263
Data That Is Described by MyData.BankChrg
                                 The SAS System                 
                       Data Described by MYDATA.BANKCHRG

    OBS        ssn              accountn    chckchrg    atmfee    loanchrg

      1    667-73-8275      345620145345      3.75       5.00        2.00 
      2    434-62-1234      345620104732     15.00      25.00      552.23 
      3    436-42-6394      345620135872      1.50       7.50      332.15 
      4    434-62-1224      345620134564      9.50       0.00        0.00 
      5    178-42-6534                 .      0.50      15.00      223.77 
      6    156-45-5672      345620123456      0.00       0.00        0.00 
      7    657-34-3245      345620132455     10.25      10.00      100.00 
      8    667-82-8275                 .      7.50       7.50      175.75 
      9    456-45-3462      345620134522     23.00      30.00      673.23 
     10    234-74-4612      345620113262      4.50       7.00        0.00
Data in the SAS Data File MyData.ChgData
                The SAS System
            SAS Data File MYDATA.CHGDATA


          OBS           account     charge

           1       345620135872        $10
           2       345620134522         $7
           3       345620123456        $12
           4       382957492811         $3
           5       345620134663         $8
           6       345620131455         $6
           7       345620104732         $9
The following SAS statements select and combine data from these three sources to create a PROC SQL view, SQL.CHARGES. The SQL.CHARGES view retrieves checking fee information so that the bank can charge customers for checking services.
options nodate linesize=132;
libname sql 'SAS-data-library';

proc sql;
  create view sql.charges as
     select distinct custacct.soc_sec_number,
            custacct.customer_name,
            custacct.check_account_number,
            chgdata.charge,
            bankchrg.chckchrg,
            bankchrg.atmfee,
            bankchrg.loanchrg
     from vlib.custacct,
          mydata.bankchrg,
          mydata.chgdata
     where custacct.soc_sec_number=bankchrg.ssn and
       custacct.check_account_number=chgdata.account;
title2 'Banking Charges for the Month';
select * from sql.charges;
The CREATE statement incorporates a WHERE clause along with the SELECT clause. The last SELECT statement retrieves and displays the PROC SQL view SQL.CHARGES. To select all the items from the view, use an asterisk (*) in place of item names. When an asterisk is used, the order of the items displayed matches the order of the items as specified in the SQL.CHARGES view definition. Notice that PROC SQL prints the output automatically to the display using the IMS item names instead of the SAS variable names. It also executes without a RUN statement when the procedure is submitted. The following output shows the data that is described by the PROC SQL view SQL.CHARGES.
Results of Combining Data from Various Sources
                                         The SAS System
                                        Banking Charges for the Month

            SOC_SEC_                              CHECK_ACCOUNT_
              NUMBER        CUSTOMER_NAME                 NUMBER   charge chckchrg  atmfee  loanchrg 
          ----------------------------------------------------------------------------------------
            156-45-5672   O'CONNOR, JOSEPH         345620123456      $12     0.00    0.00      0.00
            434-62-1224   SMITH, JAMES MARTIN      345620134663       $8     9.50    0.00      0.00
            434-62-1234   SUMMERS, MARY T.         345620104732       $9    15.00   25.00    552.23
            436-42-6394   BOOKER, APRIL M.         345620135872      $10     1.50    7.50    332.15
            456-45-3462   LITTLE, NANCY M.         345620134522       $7    23.00   30.00    673.23
            657-34-3245   BARNHARDT, PAMELA S.     345620131455       $6    10.25   10.00    100.00
            667-82-8275   COHEN, ABRAHAM           382957492811       $3     7.50    7.50    175.75

Creating New Items with the GROUP BY Clause

It is often useful to create new items with summary or aggregate functions such as the SUM function. Although you cannot use the ACCESS procedure to create new items, you can easily use the SQL procedure with data that is described by a view descriptor to display output that contains new items.
This example uses PROC SQL to retrieve and manipulate data from the view descriptor Vlib.SaveBal, which is based on the CUSTOMER and SAVEACCT segments in the AcctDBD database. When this query (as a SELECT statement is often called) is submitted, it calculates and displays the average savings account balance for each city.
options nodate linesize=80;

proc sql;
   title2 'Average Savings Balance Per City';
   select distinct city,
       avg(savings_balance) label='Average Balance'
          format=dollar12.2
   from vlib.savebal
   where city is not missing
   group by city;
The following output shows the query's result.
Results of Creating New Items with the GROUP BY Clause
                    The SAS System 
              Average Savings Balance Per City

                                           Average
        CITY                               Balance
        ------------------------------------------
        CHARLOTTESVILLE                  $1,673.35
        GORDONSVILLE                     $4,758.26
        ORANGE                             $615.60
        RAPIDAN                            $672.63
        RICHMOND                           $924.62
For more information about the SQL procedure, see the SAS SQL Procedure User's Guide.