Selecting and Combining Data with the SQL Procedure

Using the WHERE Clause

Suppose you have two view descriptors, Vlib.EmpPos and Vlib.EmpEduc, that access employee positions and employee education, respectively. You can use PROC SQL to combine these files into a single SAS data file. The WHERE clause in SAS specifies that you want a data file that contains information about employees for whom the value for their level of education is missing, and who are in the department CORPORATION.
Note: PROC SQL displays the variable labels as stored in the view. However, because you are referencing a view descriptor, you must use the SAS variable names in the WHERE clause, not the SYSTEM 2000 item names.
The output shows the results from this example. (Notice that Waterhouse appears twice in the output. This is because he has two values for schema item C411 SCHOOL, but neither value has an associated value for C412 DEGREE/CERTIFICATE.)
  proc sql;
    title 'Corporation Positions With No Degrees';
    select emppos.lastname, position, degree, departme
       from vlib.emppos, vlib.empeduc
       where emppos.lastname=empeduc.lastname and
             empeduc.degree is missing and
             emppos.departme='CORPORATION'
       order by lastname;
Output from SQL Procedure with a WHERE Clause
                    Corporation Positions With No Degrees                  1
 
       LAST NAME   POSITION TITLE    DEGREE/CERTIFICATE  DEPARTMENT
       ----------------------------------------------------------------
       FAULKNER    SECRETARY                             CORPORATION
       KNIGHT      SECRETARY                             CORPORATION
       WATERHOUSE  PRESIDENT                             CORPORATION
       WATERHOUSE  PRESIDENT                             CORPORATION

Combining Data from Various Sources

Suppose that, along with the view descriptors Vlib.EmpPos and Vlib.EmpEduc, you have the SAS data file MyData.Classes that contains in-house continuing education classes taken by employees. You can use PROC SQL to join these sources of data to form a single output table of employee names, their departments, their degrees, and the in-house classes that they have taken. The output (Data Described by the View Descriptor Vlib.EmpPos, Data Described by the View Descriptor Vlib.EmpEduc, and SAS Data File MyData.Classes) shows the results of using PROC PRINT on the data described by Vlib.EmpPos and Vlib.EmpEduc and in the file MyData.Classes.
    proc print data=vlib.emppos;
       title2 'Data Described by VLIB.EMPPOS';
    run; 

    proc print data=vlib.empeduc;
       title2 'Data Described by VLIB.EMPEDUC';
    run; 

    proc print data=mydata.classes;
       title2 'SAS Data File MYDATA.CLASSES';
    run; 
Note: If you have many PROC SQL views and view descriptors, you might want to store the PROC SQL views in a separate SAS library from your view descriptors. They both have the member type VIEW, so you cannot tell a view descriptor from a PROC SQL view.
Data Described by the View Descriptor Vlib.EmpPos
                      Data Described by VLIB.EMPPOS                        1
 
 OBS  LASTNAME    FIRSTNME      POSITION            DEPARTME        MANAGER
 
   1                            PROGRAMMER          INFORMATION SY    MYJ
   2  AMEER       DAVID         SR SALES REPRESE    MARKETING         VPB
   3  AMEER       DAVID         JR SALES REPRESE    MARKETING         VPB
   4  BOWMAN      HUGH E.       EXECUTIVE VICE-P    CORPORATION       CPW
   5  BROOKS      RUBEN R.      JR SALES REPRESE    MARKETING         MAS
   6  BROWN       VIRGINA P.    MANAGER WESTERN     MARKETING         OMG
   7  CAHILL      JACOB         MANAGER SYSTEMS     INFORMATION SY    JBM
   8  CANADY      FRANK A.      MANAGER PERSONNE    ADMINISTRATION    PRK
   9  CHAN        TAI           SR SALES REPRESE    MARKETING         TZR
  10  COLLINS     LILLIAN       MAIL CLERK          ADMINISTRATION    SQT
  11  FAULKNER    CARRIE ANN    SECRETARY           CORPORATION       JBM
  12  FERNANDEZ   SOPHIA        STANDARDS & PROC    INFORMATION SY    JLH
  13  FREEMAN     LEOPOLD       SR SYSTEMS PROGR    INFORMATION SY    JLH
Data Described by the View Descriptor Vlib.EmpEduc
                   Data Described by VLIB.EMPEDUC                         1
 
       OBS    LASTNAME       FIRSTNME       SEX       DEGREE
 
         1
         2    AMEER          DAVID          MALE      BS
         3    BOWMAN         HUGH E.        MALE      MS
         4    BOWMAN         HUGH E.        MALE      BS
         5    BOWMAN         HUGH E.        MALE      PHD
         6    BROOKS         RUBEN R.       MALE      BS
         7    BROWN          VIRGINA P      FEMALE    BA
         8    CAHILL         JACOB          MALE      BS
         9    CAHILL         JACOB          MALE      BS
        10    CANADY         FRANK A.       MALE      MA
        11    CANADY         FRANK A.       MALE      BS
        12    CHAN           TAI            MALE      PHD
        13    CHAN           TAI            MALE      BA
SAS Data File MyData.Classes
                   SAS Data File MYDATA.CLASSES                          1
 
       OBS    LASTNAME      FIRSTNME      CLASS
 
         1    AMEER         DAVID         PRESENTING IDEAS
         2    CANADY        FRANK A.      PRESENTING IDEAS
         3    GIBSON        MOLLY I.      SUPERVISOR SKILLS
         4    GIBSON        MOLLY I.      STRESS MGMT
         5    RICHARDSON    TRAVIS Z.     SUPERVISOR SKILLS
The following SAS program selects and combines data from these three sources (the two view descriptors and the SAS data file) to create the view SQL.EDUC. This view retrieves employee names, their departments, their levels of education, and the in-house classes that they have taken.
In the following program, the CREATE VIEW statement incorporates a WHERE clause as part of the SELECT statement. The last SELECT statement retrieves and displays the PROC SQL view SQL.EDUC. To select all items from the view, an asterisk (*) is used in place of item names. The order of the items that are displayed matches the order of the items as they are specified in the first SELECT clause.
The output following the program shows the data described by the SQL.EDUC view. PROC SQL uses variable labels in the output by default.
 proc sql;
  create view sql.educ as
    select emppos.lastname, emppos.firstnme, 
      emppos.departme, empeduc.degree,
      classes.class as course
      from vlib.emppos,
           vlib.empeduc, 
           mydata.classes
      where (emppos.lastname=empeduc.lastname 
             and emppos.firstnme=empeduc.firstnme) 
             and
            (empeduc.lastname=classes.lastname 
             and empeduc.firstnme=classes.firstnme)
      order by emppos.lastname, course;
 
    title 'Data Described by SQL.EDUC';
    select * from sql.educ;
Data Described by the PROC SQL View SQL.EDUC
                         Data Described by SQL.EDUC                         1
 
        LAST NAME   FORENAME      DEPARTMENT      DEGREE/CERTIFICATE
        COURSE
        ------------------------------------------------------------
        AMEER       DAVID         MARKETING       BS
        PRESENTING IDEAS
 
        AMEER       DAVID         MARKETING       BS
        PRESENTING IDEAS
 
        CANADY      FRANK A.      ADMINISTRATION  MA
        PRESENTING IDEAS
 
        CANADY      FRANK A.      ADMINISTRATION  BS
        PRESENTING IDEAS
 
        GIBSON      MOLLY I.      INFORMATION SY  BA
        STRESS MGMT
 
        GIBSON      MOLLY I.      INFORMATION SY  BA
        SUPERVISOR SKILLS
 
        RICHARDSON  TRAVIS Z.     MARKETING       BS
        SUPERVISOR SKILLS
The view SQL.EDUC lists entries for employees who have taken in-house classes, their departments, and their degrees. However, it contains duplicate observations because some employees have more than one degree and have taken more than one in-house class. To make the data more readable, you can create the final SAS data file MyData.Update by using the SET statement and the special variable FIRST. This variable identifies which observation is the first in a specific BY group. You need each employee's name associated only one time with the degrees and in-house education classes that were taken, regardless of the number of degrees or the number of classes taken.
The output that follows the program displays the data file MyData.Update that contains an observation for each unique combination of employee, degree, and in-house class.
    data mydata.update;
       set sql.educ;
       by lastname course;
       if first.lastname then output;
    run; 
 
    proc print;
       title2 'MYDATA.UPDATE Data File';
    run; 
SAS Data File MyData.Update
                              MYDATA.UPDATE Data File                        1
 
  OBS    LASTNAME      FIRSTNME     DEPARTME        DEGREE    COURSE
 
    1    AMEER         DAVID        MARKETING       BS        PRESENTING IDEAS
    2    CANADY        FRANK A.     ADMINISTRATION  MA        PRESENTING IDEAS
    3    GIBSON        MOLLY I.     INFORMATION SY  BA        STRESS MGMT
    4    RICHARDSON    TRAVIS Z.    MARKETING       BS        SUPERVISOR
SKILLS

Creating New Items with the GROUP BY Clause in PROC SQL

It is often useful to create new items with summary or aggregate functions such as AVG or SUM. You can easily use PROC SQL with data described by a view descriptor to display output that contains new items.
The following program uses PROC SQL to retrieve and manipulate data from the view descriptor Vlib.EmpVac. When this query (as a SELECT statement is often called) is submitted, it calculates and displays the average vacation time (in hours) for each department. The order of the items displayed matches the order of the items as specified in the SELECT clause of the query. The output for follows the program shows the results from using the SELECT statement.
    proc sql;
       title 'Average Vacation Per Department';
       select distinct departme,
             avg(accruedv) label='Avg Vac'
          from vlib.empvac
          where departme is not missing
          group by departme;
Data Retrieved by a PROC SQL Query
             Average Vacation Per Department         
 
                   DEPARTMENT       Avg 
                ------------------------
                ADMINISTRATION        43
                CORPORATION     40.72727
                INFORMATION SY     61.75
                MARKETING       47.61905
For more information about the SQL procedure, see the SAS SQL Procedure User's Guide.