ADABAS Data in SAS Programs |
Methods for Selecting and Combining ADABAS Data |
The great majority of SAS programs select and combine data from various sources. The method you use depends on the configuration of the data. The next three examples show you how to select and combine data using two different methods. When choosing between these methods, you should consider the issues described in Performance Considerations.
Selecting and Combining Data Using the WHERE Statement |
Suppose you have two view descriptors, VLIB.USAINV and VLIB.FORINV, that list the invoices for USA and foreign customers, respectively. You can use the SET statement to concatenate these files into a SAS data file containing information about customers who have not paid their bills and whose bills amount to at least $300,000.
The following example contains the code to create the SAS data file containing the information you want on the customers.
data notpaid(keep=invoicen billedto amtbille billedon paidon); set vlib.usainv vlib.forinv; where paidon is missing and amtbille>=300000; run; proc print; title "High Bills--Not Paid"; run;
In the SAS WHERE statement, you must use the SAS variable names, not the ADABAS data field names. Both VLIB.USAINV and VLIB.FORINV access data in the NATURAL DDM named INVOICE. The following output shows the result of the new temporary data file, WORK.NOTPAID.
Results of Selecting and Combining Data Using a WHERE statement
High Bills--Not Paid OBS INVOICEN BILLEDTO AMTBILLE BILLEDON PAIDON 1 12102 18543489 11063836.00 17NOV88 . 2 11286 43459747 12679156.00 10OCT88 . 3 12051 39045213 1340738760.90 02NOV88 . 4 12471 39045213 1340738760.90 27DEC88 . 5 12476 38763919 34891210.20 24DEC88 .
The first line of the DATA step uses the KEEP= data set option. This option works with view descriptors just as it works with other SAS data sets; that is, the KEEP= option specifies that you want only the listed variables to be included in the new data file, NOTPAID, although you can use the other variables within the DATA step.
Notice that the WHERE statement includes two conditions to be met. First, it selects only observations that have missing values for the variable PAIDON. As you can see, it is important to know how the ADABAS data is configured before you can use this data in a SAS program.
Second, the WHERE statement requires that the amount in each bill be higher than a certain figure. Again, you need to be familiar with the ADABAS data so that you can determine a reasonable figure for this expression.
When referencing a view descriptor in a SAS procedure or DATA step, it is more efficient to use a SAS WHERE statement than to use a subsetting IF statement. A DATA step or SAS procedure passes the SAS WHERE statement as a WHERE clause to the interface view engine, which adds it (using the Boolean operator AND) to any WHERE clause defined in the view descriptor. The view descriptor is then passed to ADABAS for processing. Processing ADABAS data using a WHERE clause might reduce the number of logical records read and therefore often improves performance.
For more information about the SAS WHERE statement, see the SAS Language Reference: Dictionary.
Selecting and Combining Data Using the SQL Procedure |
This section provides two examples of using the SAS SQL procedure on ADABAS data. The SQL procedure implements the Structured Query Language (SQL) and is included in Base SAS software. The first example illustrates using the SQL procedure to combine data from three sources. The second example shows how to use the PROC SQL GROUP BY clause to create new variables from data that is described by a view descriptor.
Suppose you have the view descriptors VLIB.CUSPHON and VLIB.CUSORDR based on the NATURAL DDMs CUSTOMERS and ORDER, respectively, and a SAS data file, MYDATA.OUTOFSTK, that contains names and numbers of products that are out of stock. You can use the SQL procedure to join all these sources of data to form a single output file. The SAS WHERE or subsetting IF statements would not be appropriate in this case because you want to compare variables from several sources, rather than simply merge or concatenate the data.
The following example contains the code to print the view descriptors and the SAS data file:
proc print data=vlib.cusphon; title "Data Described by VLIB.CUSPHON"; run; proc print data=vlib.cusordr; title "Data Described by VLIB.CUSORDR"; run; proc print data=mydata.outofstk; title "SAS Data File MYDATA.OUTOFSTK"; run;
The following three outputs show the results of the PRINT procedure performed on the data that is described by the view descriptors VLIB.CUSPHON and VLIB.CUSORDER and on the SAS data file MYDATA.OUTOFSTK.
Data That is Described by the View Descriptor VLIB.CUSPHON
Data Described by VLIB.CUSPHON OBS CUSTNUM PHONE 1 12345678 919/489-5682 2 14324742 408/629-0589 3 14569877 919/489-6792 4 14898029 301/760-2541 5 15432147 616/582-3906 6 18543489 512/478-0788 7 19783482 703/714-2900 8 19876078 209/686-3953 9 24589689 (012)736-202 10 26422096 4268-54-72 11 26984578 43-57-04 12 27654351 02/215-37-32 13 28710427 (021)570517 14 29834248 (0552)715311 15 31548901 406/422-3413 16 38763919 244-6324 17 39045213 012/302-1021 18 43290587 (02)933-3212 19 43459747 03/734-5111 20 46543295 (03)022-2332 21 46783280 3762855 22 48345514 213445 OBS NAME 1 2 SANTA CLARA VALLEY TECHNOLOGY SPECIALISTS 3 PRECISION PRODUCTS 4 UNIVERSITY BIOMEDICAL MATERIALS 5 GREAT LAKES LABORATORY EQUIPMENT MANUFACTURERS 6 LONE STAR STATE RESEARCH SUPPLIERS 7 TWENTY-FIRST CENTURY MATERIALS 8 SAN JOAQUIN SCIENTIFIC AND INDUSTRIAL SUPPLY, INC. 9 CENTAR ZA TECHNICKU I NAUCNU RESTAURIRANJE UMJETNINA 10 SOCIETE DE RECHERCHES POUR DE CHIRURGIE ORTHOPEDIQUE 11 INSTITUT FUR TEXTIL-FORSCHUNGS 12 INSTITUT DE RECHERCHE SCIENTIFIQUE MEDICALE 13 ANTONIE VAN LEEUWENHOEK VERENIGING VOOR MICROBIOLOGIE 14 BRITISH MEDICAL RESEARCH AND SURGICAL SUPPLY 15 NATIONAL COUNCIL FOR MATERIALS RESEARCH 16 INSTITUTO DE BIOLOGIA Y MEDICINA NUCLEAR 17 LABORATORIO DE PESQUISAS VETERNINARIAS DESIDERIO FINAMOR 18 HASSEI SAIBO GAKKAI 19 RESEARCH OUTFITTERS 20 WESTERN TECHNOLOGICAL SUPPLY 21 NGEE TECHNOLOGICAL INSTITUTE 22 GULF SCIENTIFIC SUPPLIES
Data That is Described by the View Descriptor VLIB.CUSORDR
Data Described by VLIB.CUSORDR OBS STOCKNUM SHIPTO 1 9870 19876078 2 1279 39045213 3 8934 18543489 4 3478 29834248 5 2567 19783482 6 4789 15432147 7 3478 29834248 8 1279 14324742 9 8934 31548901 10 2567 14898029 11 9870 48345514 12 1279 39045213 13 8934 18543489 14 2567 19783482 15 9870 18543489 16 3478 24589689 17 1279 38763919 18 8934 43459747 19 2567 15432147 20 9870 14324742 21 9870 19876078 22 1279 39045213 23 8934 18543489 24 3478 29834248 25 2567 19783482 26 4789 15432147 27 3478 29834248 28 1279 14324742 29 8934 31548901 30 2567 14898029 31 9870 48345514 32 1279 39045213 33 8934 18543489 34 2567 19783482 35 9870 18543489 36 3478 24589689 37 1279 38763919 38 8934 43459747 39 2567 15432147 40 9870 14324742
Data in the SAS Data File MYDATA.OUTOFSTK
SAS Data File MYDATA.OUTOFSTK OBS FIBERNAM FIBERNUM 1 olefin 3478 2 gold 8934 3 dacron 4789
The following SAS code selects and combines data from these three sources to create a PROC SQL view, SQL.BADORDR. The SQL.BADORDR view retrieves customer and product information that the sales department can use to notify customers of unavailable products.
proc sql; create view sql.badordr as select cusphon.custnum, cusphon.name, cusphon.phone, cusordr.stocknum, outofstk.fibernam as product from vlib.cusphon, vlib.cusordr, mydata.outofstk where cusordr.stocknum=outofstk.fibernum and cusphon.custnum=cusordr.shipto order by cusphon.custnum, product; title "Data Described by SQL.BADORDR"; select * from sql.badordr;
The CREATE VIEW statement incorporates a WHERE clause as part of its SELECT statement. The last SELECT statement retrieves and displays the PROC SQL view, SQL.BADORDR. To select all columns from the view, use an asterisk (*) in place of variable names. The order of the columns displayed matches the order of the columns as specified in the view descriptor SQL.BADORDR. (Note that an ORDER BY clause requires an ADABAS descriptor data field.)
The following output shows the data that is described by the SQL.BADORDR view. Note that the SQL procedure uses the column labels in the output by default.
Results of Combining Data from Various Sources
Data Described by SQL.BADORDR CUSTOMER NAME TELEPHONE STOCKNUM PRODUCT ---------------------------------------------------------------------- 15432147 GREAT LAKES LABORATORY EQUIPMENT MANUFACTURERS 616/582-3906 4789 dacron 15432147 GREAT LAKES LABORATORY EQUIPMENT MANUFACTURERS 616/582-3906 4789 dacron 18543489 LONE STAR STATE RESEARCH SUPPLIERS 512/478-0788 8934 gold 18543489 LONE STAR STATE RESEARCH SUPPLIERS 512/478-0788 8934 gold 18543489 LONE STAR STATE RESEARCH SUPPLIERS 512/478-0788 8934 gold 18543489 LONE STAR STATE RESEARCH SUPPLIERS 512/478-0788 8934 gold 24589689 CENTAR ZA TECHNICKU I NAUCNU RESTAURIRANJE UMJETNINA (012)736-202 3478 olefin 24589689 CENTAR ZA TECHNICKU I NAUCNU RESTAURIRANJE UMJETNINA (012)736-202 3478 olefin 29834248 BRITISH MEDICAL RESEARCH AND SURGICAL SUPPLY (0552)715311 3478 olefin 29834248 BRITISH MEDICAL RESEARCH AND SURGICAL SUPPLY (0552)715311 3478 olefin 29834248 BRITISH MEDICAL RESEARCH AND SURGICAL SUPPLY (0552)715311 3478 olefin 29834248 BRITISH MEDICAL RESEARCH AND SURGICAL SUPPLY (0552)715311 3478 olefin 31548901 NATIONAL COUNCIL FOR MATERIALS RESEARCH 406/422-3413 8934 gold 31548901 NATIONAL COUNCIL FOR MATERIALS RESEARCH 406/422-3413 8934 gold 43459747 RESEARCH OUTFITTERS 03/734-5111 8934 gold 43459747 RESEARCH OUTFITTERS 03/734-5111 8934 gold
The view SQL.BADORDR lists entries for all customers who have ordered out-of-stock products. However, it contains duplicate rows because some companies have ordered the same product more than once. To make the data more readable for the sales department, you can create a final SAS data file, MYDATA.BADNEWS, using the results of the PROC SQL view as input in the SET statement and the special variable FIRST.PRODUCT. This variable identifies which row is the first in a particular BY group. You only need a customer's name once to notify them that a product is out of stock, regardless of the number of times the customer has placed an order for it.
data mydata.badnews; set sql.badordr; by custnum product; if first.product; run; proc print; title "MYDATA.BADNEWS Data File"; quit;
The data file MYDATA.BADNEWS contains an observation for each unique combination of customer and out-of-stock product. The following output displays this data file.
Results of Grouping Data Using First.variable
MYDATA.BADNEWS Data File OBS CUSTNUM NAME 1 15432147 GREAT LAKES LABORATORY EQUIPMENT MANUFACTURERS 2 18543489 LONE STAR STATE RESEARCH SUPPLIERS 3 24589689 CENTAR ZA TECHNICKU I NAUCNU RESTAURIRANJE UMJETNINA 4 29834248 BRITISH MEDICAL RESEARCH AND SURGICAL SUPPLY 5 31548901 NATIONAL COUNCIL FOR MATERIALS RESEARCH 6 43459747 RESEARCH OUTFITTERS OBS PHONE STOCKNUM PRODUCT 1 616/582-3906 4789 dacron 2 512/478-0788 8934 gold 3 (012)736-202 3478 olefin 4 (0552)715311 3478 olefin 5 406/422-3413 8934 gold 6 03/734-5111 8934 gold
For more information about FIRST.variable, see the SAS Language Reference: Dictionary.
It is often useful to create new variables with summarizing or variable functions such as AVG or SUM. Although you cannot use the ACCESS procedure to create new variables, you can easily use the SQL procedure with data that is described by a view descriptor to display output that contains new variables.
This example uses the SQL procedure to retrieve and manipulate data accessed by the view descriptor VLIB.ALLEMP, which accesses data in the NATURAL DDM named EMPLOYEE. When this query (as a SELECT statement is often called) is submitted, it calculates and displays the average salary for each department; the AVG function is the SQL procedure's equivalent of the SAS MEAN function.
proc sql; title "Average Salary Per Department"; select distinct dept, avg(salary) label="Average Salary" format=dollar12.2 from vlib.allemp where dept is not missing group by dept;
The order of the variables that are displayed matches the order of the variables as specified in the SELECT list of the query. The following output shows the query's result.
Results of Creating New Variables With the GROUP BY Clause
Average Salary Per Department Average DEPT Salary -------------------- ACC013 $54,591.33 ACC024 $55,370.55 ACC043 $75,000.34 CSR004 $17,000.00 CSR010 $44,324.19 CSR011 $41,966.16 SHP002 $40,111.31 SHP013 $41,068.44 SHP024 $50,000.00
For more information about the SQL procedure, see the SQL section in the Base SAS Procedures Guide.
Copyright © 2007 by SAS Institute Inc., Cary, NC, USA. All rights reserved.