Selecting and Combining ADABAS Data

Methods for Selecting and Combining ADABAS Data

The great majority of SAS programs select and combine data from various sources. The method that you use depends on the configuration of the data. The next three examples show you how to select and combine data using two different methods. When choosing between these methods, you should consider the issues described in Performance Considerations.

Selecting and Combining Data Using the WHERE Statement

Suppose you have two view descriptors, VLIB.USAINV and VLIB.FORINV, that list the invoices for USA and foreign customers, respectively. You can use the SET statement to concatenate these files into a SAS data file containing information about customers who have not paid their bills and whose bills amount to at least $300,000.
The following example contains the code to create the SAS data file containing the information that you want on the customers.
data notpaid(keep=invoicen billedto amtbille 
     billedon paidon);
   set vlib.usainv vlib.forinv;
   where paidon is missing and 
      amtbille>=300000;
run;
proc print;
   title "High Bills--Not Paid";
run;
In the SAS WHERE statement, you must use the SAS variable names, not the ADABAS data field names. Both VLIB.USAINV and VLIB.FORINV access data in the NATURAL DDM named INVOICE. The following output shows the result of the new temporary data file, WORK.NOTPAID.
Results of Selecting and Combining Data Using a WHERE Statement
                              High Bills--Not Paid                             

    OBS    INVOICEN    BILLEDTO              AMTBILLE    BILLEDON     PAIDON

     1       12102     18543489           11063836.00    17NOV88           .
     2       11286     43459747           12679156.00    10OCT88           .
     3       12051     39045213         1340738760.90    02NOV88           .
     4       12471     39045213         1340738760.90    27DEC88           .
     5       12476     38763919           34891210.20    24DEC88           .
The first line of the DATA step uses the KEEP= data set option. This option works with view descriptors just as it works with other SAS data sets. That is, the KEEP= option specifies that you want only the listed variables to be included in the new data file, NOTPAID, although you can use the other variables within the DATA step.
Notice that the WHERE statement includes two conditions to be met. First, it selects only observations that have missing values for the variable PAIDON. As you can see, it is important to know how the ADABAS data is configured before you can use this data in a SAS program.
Second, the WHERE statement requires that the amount in each bill be higher than a certain figure. Again, you need to be familiar with the ADABAS data so that you can determine a reasonable figure for this expression.
When referencing a view descriptor in a SAS procedure or DATA step, it is more efficient to use a SAS WHERE statement than to use a subsetting IF statement. A DATA step or SAS procedure passes the SAS WHERE statement as a WHERE clause to the interface view engine, which adds it (using the Boolean operator AND) to any WHERE clause defined in the view descriptor. The view descriptor is then passed to ADABAS for processing. Processing ADABAS data using a WHERE clause might reduce the number of logical records read and therefore often improves performance.
For more information about the SAS WHERE statement, see the SAS Statements: Reference.

Selecting and Combining Data Using the SQL Procedure

Examples Using the SAS SQL Procedure

This section provides two examples of using the SAS SQL procedure on ADABAS data. The SQL procedure implements the Structured Query Language (SQL) and is included in Base SAS software. The first example illustrates using the SQL procedure to combine data from three sources. The second example shows how to use the PROC SQL GROUP BY clause to create new variables from data that is described by a view descriptor.

Combining Data from Various Sources

Suppose you have the view descriptors VLIB.CUSPHON and VLIB.CUSORDR based on the NATURAL DDMs CUSTOMERS and ORDER, respectively, and a SAS data file, MYDATA.OUTOFSTK, that contains names and numbers of products that are out of stock. You can use the SQL procedure to join all these sources of data to form a single output file. The SAS WHERE or subsetting IF statements would not be appropriate in this case because you want to compare variables from several sources, rather than simply merge or concatenate the data.
The following example contains the code to print the view descriptors and the SAS data file:
proc print data=vlib.cusphon;
   title "Data Described by VLIB.CUSPHON";
run;

proc print data=vlib.cusordr;
   title "Data Described by VLIB.CUSORDR";
run;

proc print data=mydata.outofstk;
   title "SAS Data File MYDATA.OUTOFSTK";
run;
The following three outputs show the results of the PRINT procedure performed on the data that is described by the view descriptors VLIB.CUSPHON and VLIB.CUSORDER and on the SAS data file MYDATA.OUTOFSTK.
Data That Is Described by the View Descriptor VLIB.CUSPHON
                         Data Described by VLIB.CUSPHON                        

         OBS    CUSTNUM     PHONE

           1    12345678    919/489-5682
           2    14324742    408/629-0589
           3    14569877    919/489-6792
           4    14898029    301/760-2541
           5    15432147    616/582-3906
           6    18543489    512/478-0788
           7    19783482    703/714-2900
           8    19876078    209/686-3953
           9    24589689    (012)736-202
          10    26422096    4268-54-72
          11    26984578    43-57-04
          12    27654351    02/215-37-32
          13    28710427    (021)570517
          14    29834248    (0552)715311
          15    31548901    406/422-3413
          16    38763919    244-6324
          17    39045213    012/302-1021
          18    43290587    (02)933-3212
          19    43459747    03/734-5111
          20    46543295    (03)022-2332
          21    46783280    3762855
          22    48345514    213445

         OBS    NAME

           1
           2    SANTA CLARA VALLEY TECHNOLOGY SPECIALISTS
           3    PRECISION PRODUCTS
           4    UNIVERSITY BIOMEDICAL MATERIALS
           5    GREAT LAKES LABORATORY EQUIPMENT MANUFACTURERS
           6    LONE STAR STATE RESEARCH SUPPLIERS
           7    TWENTY-FIRST CENTURY MATERIALS
           8    SAN JOAQUIN SCIENTIFIC AND INDUSTRIAL SUPPLY, INC.
           9    CENTAR ZA TECHNICKU I NAUCNU RESTAURIRANJE UMJETNINA
          10    SOCIETE DE RECHERCHES POUR DE CHIRURGIE ORTHOPEDIQUE
          11    INSTITUT FUR TEXTIL-FORSCHUNGS
          12    INSTITUT DE RECHERCHE SCIENTIFIQUE MEDICALE
          13    ANTONIE VAN LEEUWENHOEK VERENIGING VOOR MICROBIOLOGIE
          14    BRITISH MEDICAL RESEARCH AND SURGICAL SUPPLY
          15    NATIONAL COUNCIL FOR MATERIALS RESEARCH
          16    INSTITUTO DE BIOLOGIA Y MEDICINA NUCLEAR
          17    LABORATORIO DE PESQUISAS VETERNINARIAS DESIDERIO FINAMOR
          18    HASSEI SAIBO GAKKAI
          19    RESEARCH OUTFITTERS
          20    WESTERN TECHNOLOGICAL SUPPLY
          21    NGEE TECHNOLOGICAL INSTITUTE
          22    GULF SCIENTIFIC SUPPLIES
Data That Is Described by the View Descriptor VLIB.CUSORDR
                         Data Described by VLIB.CUSORDR                       

                             OBS    STOCKNUM    SHIPTO

                               1      9870      19876078
                               2      1279      39045213
                               3      8934      18543489
                               4      3478      29834248
                               5      2567      19783482
                               6      4789      15432147
                               7      3478      29834248
                               8      1279      14324742
                               9      8934      31548901
                              10      2567      14898029
                              11      9870      48345514
                              12      1279      39045213
                              13      8934      18543489
                              14      2567      19783482
                              15      9870      18543489
                              16      3478      24589689
                              17      1279      38763919
                              18      8934      43459747
                              19      2567      15432147
                              20      9870      14324742
                              21      9870      19876078
                              22      1279      39045213
                              23      8934      18543489
                              24      3478      29834248
                              25      2567      19783482
                              26      4789      15432147
                              27      3478      29834248
                              28      1279      14324742
                              29      8934      31548901
                              30      2567      14898029
                              31      9870      48345514
                              32      1279      39045213
                              33      8934      18543489
                              34      2567      19783482
                              35      9870      18543489
                              36      3478      24589689
                              37      1279      38763919
                              38      8934      43459747
                              39      2567      15432147
                              40      9870      14324742
Data in the SAS Data File MYDATA.OUTOFSTK
                 SAS Data File MYDATA.OUTOFSTK 

                  OBS    FIBERNAM    FIBERNUM

                   1      olefin       3478
                   2      gold         8934
                   3      dacron       4789
The following SAS code selects and combines data from these three sources to create a PROC SQL view, SQL.BADORDR. The SQL.BADORDR view retrieves customer and product information that the sales department can use to notify customers of unavailable products.
proc sql;
create view sql.badordr as
   select cusphon.custnum, cusphon.name, 
          cusphon.phone, cusordr.stocknum, 
          outofstk.fibernam as product
      from vlib.cusphon, vlib.cusordr, 
           mydata.outofstk
      where cusordr.stocknum=outofstk.fibernum
           and cusphon.custnum=cusordr.shipto
      order by cusphon.custnum, product;
   title "Data Described by SQL.BADORDR";
   select * from sql.badordr;
The CREATE VIEW statement incorporates a WHERE clause as part of its SELECT statement. The last SELECT statement retrieves and displays the PROC SQL view, SQL.BADORDR. To select all columns from the view, use an asterisk (*) in place of variable names. The order of the columns displayed matches the order of the columns as specified in the view descriptor SQL.BADORDR. (Note that an ORDER BY clause requires an ADABAS descriptor data field.)
The following output shows the data that is described by the SQL.BADORDR view. Note that the SQL procedure uses the column labels in the output by default.
Results of Combining Data from Various Sources
                         Data Described by SQL.BADORDR                        

     CUSTOMER  NAME
     TELEPHONE     STOCKNUM  PRODUCT
     ----------------------------------------------------------------------
     15432147  GREAT LAKES LABORATORY EQUIPMENT MANUFACTURERS
     616/582-3906      4789  dacron

     15432147  GREAT LAKES LABORATORY EQUIPMENT MANUFACTURERS
     616/582-3906      4789  dacron

     18543489  LONE STAR STATE RESEARCH SUPPLIERS
     512/478-0788      8934  gold

     18543489  LONE STAR STATE RESEARCH SUPPLIERS
     512/478-0788      8934  gold

     18543489  LONE STAR STATE RESEARCH SUPPLIERS
     512/478-0788      8934  gold

     18543489  LONE STAR STATE RESEARCH SUPPLIERS
     512/478-0788      8934  gold

     24589689  CENTAR ZA TECHNICKU I NAUCNU RESTAURIRANJE UMJETNINA
     (012)736-202      3478  olefin

     24589689  CENTAR ZA TECHNICKU I NAUCNU RESTAURIRANJE UMJETNINA
     (012)736-202      3478  olefin

     29834248  BRITISH MEDICAL RESEARCH AND SURGICAL SUPPLY
     (0552)715311      3478  olefin

     29834248  BRITISH MEDICAL RESEARCH AND SURGICAL SUPPLY
     (0552)715311      3478  olefin

     29834248  BRITISH MEDICAL RESEARCH AND SURGICAL SUPPLY
     (0552)715311      3478  olefin

     29834248  BRITISH MEDICAL RESEARCH AND SURGICAL SUPPLY
     (0552)715311      3478  olefin

     31548901  NATIONAL COUNCIL FOR MATERIALS RESEARCH
     406/422-3413      8934  gold

     31548901  NATIONAL COUNCIL FOR MATERIALS RESEARCH
     406/422-3413      8934  gold

     43459747  RESEARCH OUTFITTERS
     03/734-5111       8934  gold

     43459747  RESEARCH OUTFITTERS
     03/734-5111       8934  gold
The view SQL.BADORDR lists entries for all customers who have ordered out-of-stock products. However, it contains duplicate rows because some companies have ordered the same product more than once. To make the data more readable for the sales department, you can create a final SAS data file, MYDATA.BADNEWS, using the results of the PROC SQL view as input in the SET statement and the special variable FIRST.PRODUCT. This variable identifies which row is the first in a particular BY group. You need only a customer's name once to notify them that a product is out of stock, regardless of the number of times the customer has placed an order for it.
data mydata.badnews;
   set sql.badordr;
   by custnum product;
   if first.product;
run;

proc print;
   title "MYDATA.BADNEWS Data File";
quit;
The data file MYDATA.BADNEWS contains an observation for each unique combination of customer and out-of-stock product. The following output displays this data file.
Results of Grouping Data Using First.variable
                            MYDATA.BADNEWS Data File                          

 OBS   CUSTNUM     NAME

  1    15432147    GREAT LAKES LABORATORY EQUIPMENT MANUFACTURERS
  2    18543489    LONE STAR STATE RESEARCH SUPPLIERS
  3    24589689    CENTAR ZA TECHNICKU I NAUCNU RESTAURIRANJE UMJETNINA
  4    29834248    BRITISH MEDICAL RESEARCH AND SURGICAL SUPPLY
  5    31548901    NATIONAL COUNCIL FOR MATERIALS RESEARCH
  6    43459747    RESEARCH OUTFITTERS

 OBS   PHONE           STOCKNUM    PRODUCT

  1    616/582-3906      4789      dacron
  2    512/478-0788      8934      gold
  3    (012)736-202      3478      olefin
  4    (0552)715311      3478      olefin
  5    406/422-3413      8934      gold
  6    03/734-5111       8934      gold
For more information about FIRST.variable, see the SAS Statements: Reference.

Creating New Variables with the GROUP BY Clause

It is often useful to create new variables with summarizing or variable functions such as AVG or SUM. Although you cannot use the ACCESS procedure to create new variables, you can easily use the SQL procedure with data that is described by a view descriptor to display output that contains new variables.
This example uses the SQL procedure to retrieve and manipulate data accessed by the view descriptor VLIB.ALLEMP, which accesses data in the NATURAL DDM named EMPLOYEE. When this query (as a SELECT statement is often called) is submitted, it calculates and displays the average salary for each department; the AVG function is the SQL procedure's equivalent of the SAS MEAN function.
proc sql;
   title "Average Salary Per Department";
   select distinct dept,
         avg(salary) label="Average Salary" 
            format=dollar12.2
      from vlib.allemp
      where dept is not missing
      group by dept;
The order of the variables that are displayed matches the order of the variables as specified in the SELECT list of the query. The following output shows the query's result.
Results of Creating New Variables with the GROUP BY Clause
            Average Salary Per Department                    
                              Average
                 DEPT          Salary
                 --------------------
                 ACC013    $54,591.33
                 ACC024    $55,370.55
                 ACC043    $75,000.34
                 CSR004    $17,000.00
                 CSR010    $44,324.19
                 CSR011    $41,966.16
                 SHP002    $40,111.31
                 SHP013    $41,068.44
                 SHP024    $50,000.00
For more information about the SQL procedure, see the Base SAS Procedures Guide.