SQL Procedure

Example 7: Performing an Outer Join

Features:

joined-table component

left outer join

SELECT clause : COALESCE function

WHERE clause : CONTAINS condition

Table names: PROCLIB.PAYROLL

PROCLIB.PAYROLL2

Details

This example illustrates a left outer join of the PROCLIB.PAYROLL and PROCLIB.PAYROLL2 tables.
proc sql outobs=10;
   title 'PROCLIB.PAYROLL';
   title2 'First 10 Rows Only';
   select * from proclib.payroll
   order by idnumber;
   title;
PROCLIB.PAYROLL
PROCLIB.PAYROLL Table
proc sql;
   title 'PROCLIB.PAYROLL2';
   select * from proclib.payroll2
    order by idnum;
   title;
PROCLIB.PAYROLL2
PROCLIB.PAYROLL2 Table

Program Using OUTER JOIN Based on ID Number

libname proclib 'SAS-library';
proc sql outobs=10;
title 'Most Current Jobcode and Salary Information';
select p.IdNumber, p.Jobcode, p.Salary,
          p2.jobcode label='New Jobcode',
          p2.salary label='New Salary' format=dollar8.
from proclib.payroll as p left join proclib.payroll2 as p2
on p.IdNumber=p2.idnum;

Program Description

Declare the PROCLIB library.The PROCLIB library is used in these examples to store created tables.
libname proclib 'SAS-library';
Limit the number of output rows.OUTOBS= limits the output to 10 rows.
proc sql outobs=10;
Specify the title for the first query.
title 'Most Current Jobcode and Salary Information';
Select the columns.The SELECT clause lists the columns to select. Some column names are prefixed with a table alias because they are in both tables. LABEL= and FORMAT= are column modifiers.
select p.IdNumber, p.Jobcode, p.Salary,
          p2.jobcode label='New Jobcode',
          p2.salary label='New Salary' format=dollar8.
Specify the type of join.The FROM clause lists the tables to join and assigns table aliases. The keywords LEFT JOIN specify the type of join. The order of the tables in the FROM clause is important. PROCLIB.PAYROLL is listed first and is considered the “left” table. PROCLIB.PAYROLL2 is the “right” table.
from proclib.payroll as p left join proclib.payroll2 as p2
Specify the join criterion.The ON clause specifies that the join be performed based on the values of the ID numbers from each table.
on p.IdNumber=p2.idnum;

Output for OUTER JOIN

As the output shows, all rows from the left table, PROCLIB.PAYROLL, are returned. PROC SQL assigns missing values for rows in the left table, PAYROLL, that have no matching values for IdNum in PAYROLL2.
Most Current Jobcode and Salary Information

Program Using COALESCE and LEFT JOIN

proc sql outobs=10;
title 'Most Current Jobcode and Salary Information';
select p.idnumber, coalesce(p2.jobcode,p.jobcode)
        label='Current Jobcode',
coalesce(p2.salary,p.salary) label='Current Salary'
                  format=dollar8.
from proclib.payroll p left join proclib.payroll2 p2
      on p.IdNumber=p2.idnum;

Program Description

proc sql outobs=10;
Specify the title for the second query.
title 'Most Current Jobcode and Salary Information';
Select the columns and coalesce the Jobcode columns.The SELECT clause lists the columns to select. COALESCE overlays the like-named columns. For each row, COALESCE returns the first nonmissing value of either P2.JOBCODE or P.JOBCODE. Because P2.JOBCODE is the first argument, if there is a nonmissing value for P2.JOBCODE, COALESCE returns that value. Thus, the output contains the most recent job code information for every employee. LABEL= assigns a column label.
select p.idnumber, coalesce(p2.jobcode,p.jobcode)
        label='Current Jobcode',
Coalesce the Salary columns.For each row, COALESCE returns the first nonmissing value of either P2.SALARY or P.SALARY. Because P2.SALARY is the first argument, if there is a nonmissing value for P2.SALARY, then COALESCE returns that value. Thus, the output contains the most recent salary information for every employee.
coalesce(p2.salary,p.salary) label='Current Salary'
                  format=dollar8.
Specify the type of join and the join criterion.The FROM clause lists the tables to join and assigns table aliases. The keywords LEFT JOIN specify the type of join. The ON clause specifies that the join is based on the ID numbers from each table.
from proclib.payroll p left join proclib.payroll2 p2
      on p.IdNumber=p2.idnum;

Output for COALESCE and LEFT JOIN

Most Current Jobcode and Salary Information
Most Current Jobcode and Salary Information

Program to Subset the Query

proc sql;

title 'Most Current Information for Ticket Agents';
   select p.IdNumber,
          coalesce(p2.jobcode,p.jobcode) label='Current Jobcode',
          coalesce(p2.salary,p.salary) label='Current Salary'
      from proclib.payroll p left join proclib.payroll2 p2
      on p.IdNumber=p2.idnum
      where p2.jobcode contains 'TA';

Program Description

Subset the query.The WHERE clause subsets the left join to include only those rows containing the value TA.
proc sql;

title 'Most Current Information for Ticket Agents';
   select p.IdNumber,
          coalesce(p2.jobcode,p.jobcode) label='Current Jobcode',
          coalesce(p2.salary,p.salary) label='Current Salary'
      from proclib.payroll p left join proclib.payroll2 p2
      on p.IdNumber=p2.idnum
      where p2.jobcode contains 'TA';

Output for Subset of the Query

Query Results with the Value TA
Most Current Information for Ticket Agents