Previous Page | Next Page

The SQL Procedure

sql-expression


Produces a value from a sequence of operands and operators.
operand operator operand

Arguments

operand

is one of the following:

  • a constant, which is a number or a quoted character string (or other special notation) that indicates a fixed value. Constants are also called literals. Constants are described in SAS Language Reference: Dictionary.

  • a column-name, which is described in column-name.

  • a CASE expression, which is described in CASE expression.

  • any supported SAS function. PROC SQL supports many of the functions available to the SAS DATA step. Some of the functions that aren't supported are the variable information functions, functions that work with arrays of data, and functions that operate on rows other than the current row. Other SQL databases support their own sets of functions. Functions are described in the SAS Language Reference: Dictionary.

  • any functions, except those with array elements, that are created with PROC FCMP.

  • the ANSI SQL functions COALESCE, BTRIM, LOWER, UPPER, and SUBSTRING.

  • a summary-function, which is described in summary-function.

  • a query-expression, which is described in query-expression.

  • the USER literal, which references the userid of the person who submitted the program. The userid that is returned is operating environment-dependent, but PROC SQL uses the same value that the &SYSJOBID macro variable has on the operating environment.

operator

is described in Operators and the Order of Evaluation.

Note:   SAS functions, including summary functions, can stand alone as SQL expressions. For example

select min(x) from table;

select scan(y,4) from table;

  [cautionend]


SAS Functions

PROC SQL supports many of the functions available to the SAS DATA step. Some of the functions that aren't supported are the variable information functions and functions that work with arrays of data. Other SQL databases support their own sets of functions. For example, the SCAN function is used in the following query:

    select style, scan(street,1) format=$15.
       from houses;

PROC SQL also supports any user-written functions, except those functions with array elements, that are created using PROC FCMP.

See the SAS Language Reference: Dictionary for complete documentation of SAS functions. Summary functions are also SAS functions. See summary-function for more information.


USER Literal

USER can be specified in a view definition, for example, to create a view that restricts access to the views in the user's department. Note that the USER literal value is stored in uppercase, so it is advisable to use the UPCASE function when comparing to this value:

   create view myemp as
      select * from dept12.employees
         where upcase(manager)=user;

This view produces a different set of employee information for each manager who references it.


Operators and the Order of Evaluation

The order in which operations are evaluated is the same as in the DATA step with this one exception: NOT is grouped with the logical operators AND and OR in PROC SQL; in the DATA step, NOT is grouped with the unary plus and minus signs.

Unlike missing values in some versions of SQL, missing values in SAS always appear first in the collating sequence. Therefore, in Boolean and comparison operations, the following expressions resolve to true in a predicate:

   3>null
  -3>null
   0>null

You can use parentheses to group values or to nest mathematical expressions. Parentheses make expressions easier to read and can also be used to change the order of evaluation of the operators. Evaluating expressions with parentheses begins at the deepest level of parentheses and moves outward. For example, SAS evaluates A+B*C as A+(B*C), although you can add parentheses to make it evaluate as (A+B)*C for a different result.

Higher priority operations are performed first: that is, group 0 operators are evaluated before group 5 operators. The following table shows the operators and their order of evaluation, including their priority groups.

Operators and Order of Evaluation
Group Operator Description
0 ( ) forces the expression enclosed to be evaluated first
1 case-expression selects result values that satisfy specified conditions
2 ** raises to a power

unary +, unary - indicates a positive or negative number
3 * multiplies

/ divides
4 + adds

- subtracts
5 || concatenates
6 <NOT> BETWEEN condition See BETWEEN condition.

<NOT> CONTAINS condition see CONTAINS condition.

<NOT> EXISTS condition See EXISTS condition.

<NOT> IN condition See IN condition.

IS <NOT> condition See IS condition.

<NOT> LIKE condition See LIKE condition.
7 =, eq equals

¬=, ^=, < >, ne does not equal

>, gt is greater than

<, lt is less than

>=, ge is greater than or equal to

<=, le is less than or equal to

=* sounds like (use with character operands only). See Retrieving Values with the SOUNDS-LIKE Operator.

eqt equal to truncated strings (use with character operands only). See Truncated String Comparison Operators.

gtt greater than truncated strings

ltt less than truncated strings

get greater than or equal to truncated strings

let less than or equal to truncated strings

net not equal to truncated strings
8 ¬, ^, NOT indicates logical NOT
9 &, AND indicates logical AND
10 |, OR indicates logical OR

Symbols for operators might vary, depending on your operating environment. See SAS Language Reference: Dictionary for more information on operators and expressions.


Truncated String Comparison Operators

PROC SQL supports truncated string comparison operators. (See Group 7 in Operators and Order of Evaluation.) In a truncated string comparison, the comparison is performed after making the strings the same length by truncating the longer string to be the same length as the shorter string. For example, the expression 'TWOSTORY' eqt 'TWO' is true because the string 'TWOSTORY' is reduced to 'TWO' before the comparison is performed. Note that the truncation is performed internally; neither operand is permanently changed.

Note:   Unlike the DATA step, PROC SQL does not support the colon operators (such as =:, >:, and <=:) for truncated string comparisons. Use the alphabetic operators (such as EQT, GTT, and LET).  [cautionend]


Query Expressions (Subqueries)

A query-expression is called a subquery when it is used in a WHERE or HAVING clause. A subquery is a query-expression that is nested as part of another query-expression. A subquery selects one or more rows from a table based on values in another table.

Depending on the clause that contains it, a subquery can return a single value or multiple values. If more than one subquery is used in a query-expression, then the innermost query is evaluated first, then the next innermost query, and so on, moving outward.

PROC SQL allows a subquery (contained in parentheses) at any point in an expression where a simple column value or constant can be used. In this case, a subquery must return a single value, that is, one row with only one column.

The following is an example of a subquery that returns one value. This PROC SQL step subsets the PROCLIB.PAYROLL table based on information in the PROCLIB.STAFF table. (PROCLIB.PAYROLL is shown in Creating a Table from a Query's Result, and PROCLIB.STAFF is shown in Joining Two Tables.) PROCLIB.PAYROLL contains employee identification numbers (IdNumber) and their salaries (Salary) but does not contain their names. If you want to return only the row from PROCLIB.PAYROLL for one employee, then you can use a subquery that queries the PROCLIB.STAFF table, which contains the employees' identification numbers and their names (Lname and Fname).

options ls=64 nodate nonumber;
proc sql;
    title 'Information for Earl Bowden';
    select *
       from proclib.payroll
       where idnumber=
             (select idnum
             from proclib.staff
             where upcase(lname)='BOWDEN');

Query Output - One Value

                  Information for Earl Bowden

      Id
      Number  Gender  Jobcode    Salary    Birth    Hired
      ---------------------------------------------------
      1403    M       ME1         28072  28JAN69  21DEC91

Subqueries can return multiple values. The following example uses the tables PROCLIB.DELAY and PROCLIB.MARCH. These tables contain information about the same flights and have the Flight column in common. The following subquery returns all the values for Flight in PROCLIB.DELAY for international flights. The values from the subquery complete the WHERE clause in the outer query. Thus, when the outer query is executed, only the international flights from PROCLIB.MARCH are in the output.

options ls=64 nodate nonumber;
proc sql outobs=5;
   title 'International Flights from';
   title2 'PROCLIB.MARCH';
   select Flight, Date, Dest, Boarded
      from proclib.march
      where flight in 
            (select flight
             from proclib.delay
             where destype='International');

Query Output - Multiple Values

                   International Flights from
                         PROCLIB.MARCH

                Flight     Date  Dest   Boarded
                -------------------------------
                219     01MAR94  LON        198
                622     01MAR94  FRA        207
                132     01MAR94  YYZ        115
                271     01MAR94  PAR        138
                219     02MAR94  LON        147

Sometimes it is helpful to compare a value with a set of values returned by a subquery. The keywords ANY or ALL can be specified before a subquery when the subquery is the right-hand operand of a comparison. If ALL is specified, then the comparison is true only if it is true for all values that are returned by the subquery. If a subquery returns no rows, then the result of an ALL comparison is true for each row of the outer query.

If ANY is specified, then the comparison is true if it is true for any one of the values that are returned by the subquery. If a subquery returns no rows, then the result of an ANY comparison is false for each row of the outer query.

The following example selects all of the employees in PROCLIB.PAYROLL who earn more than the highest paid ME3:

options ls=64 nodate nonumber ;
proc sql;
title "Employees who Earn More than";
title2 "All ME's";
   select *
     from proclib.payroll
     where salary > all (select salary
                         from proclib.payroll
                         where jobcode='ME3');

Query Output Using ALL Comparison

                  Employees who Earn More than
                            All ME's

      Id
      Number  Gender  Jobcode    Salary    Birth    Hired
      ---------------------------------------------------
      1333    M       PT2         88606  30MAR61  10FEB81
      1739    M       PT1         66517  25DEC64  27JAN91
      1428    F       PT1         68767  04APR60  16NOV91
      1404    M       PT2         91376  24FEB53  01JAN80
      1935    F       NA2         51081  28MAR54  16OCT81
      1905    M       PT1         65111  16APR72  29MAY92
      1407    M       PT1         68096  23MAR69  18MAR90
      1410    M       PT2         84685  03MAY67  07NOV86
      1439    F       PT1         70736  06MAR64  10SEP90
      1545    M       PT1         66130  12AUG59  29MAY90
      1106    M       PT2         89632  06NOV57  16AUG84
      1442    F       PT2         84536  05SEP66  12APR88
      1417    M       NA2         52270  27JUN64  07MAR89
      1478    M       PT2         84203  09AUG59  24OCT90
      1556    M       PT1         71349  22JUN64  11DEC91
      1352    M       NA2         53798  02DEC60  16OCT86
      1890    M       PT2         91908  20JUL51  25NOV79
      1107    M       PT2         89977  09JUN54  10FEB79
      1830    F       PT2         84471  27MAY57  29JAN83
      1928    M       PT2         89858  16SEP54  13JUL90
      1076    M       PT1         66558  14OCT55  03OCT91

Note:   See the first item in Subqueries and Efficiency for a note about efficiency when using ALL.  [cautionend]

In order to visually separate a subquery from the rest of the query, you can enclose the subquery in any number of pairs of parentheses.


Correlated Subqueries

In a correlated subquery, the WHERE expression in a subquery refers to values in a table in the outer query. The correlated subquery is evaluated for each row in the outer query. With correlated subqueries, PROC SQL executes the subquery and the outer query together.

The following example uses the PROCLIB.DELAY and PROCLIB.MARCH tables. A DATA step (PROCLIB.DELAY) creates PROCLIB.DELAY. PROCLIB.MARCH is shown in Producing All the Possible Combinations of the Values in a Column. PROCLIB.DELAY has the Flight, Date, Orig, and Dest columns in common with PROCLIB.MARCH:

proc sql outobs=5;
   title 'International Flights';
   select *
      from proclib.march
      where 'International' in 
           (select destype
            from proclib.delay
            where march.Flight=delay.Flight);

The subquery resolves by substituting every value for MARCH.Flight into the subquery's WHERE clause, one row at a time. For example, when MARCH.Flight=219 , the subquery resolves as follows:

  1. PROC SQL retrieves all the rows from DELAY where Flight=219 and passes their DESTYPE values to the WHERE clause.

  2. PROC SQL uses the DESTYPE values to complete the WHERE clause:

    where 'International' in
       ('International','International', ...)

  3. The WHERE clause checks to determine whether International is in the list. Because it is, all rows from MARCH that have a value of 219 for Flight become part of the output.

The following output contains the rows from MARCH for international flights only.

Correlated Subquery Output

                             International Flights

       Flight     Date  Depart  Orig  Dest     Miles   Boarded  Capacity
       -----------------------------------------------------------------
       219     01MAR94    9:31  LGA   LON       3442       198       250
       622     01MAR94   12:19  LGA   FRA       3857       207       250
       132     01MAR94   15:35  LGA   YYZ        366       115       178
       271     01MAR94   13:17  LGA   PAR       3635       138       250
       219     02MAR94    9:31  LGA   LON       3442       147       250

Subqueries and Efficiency

Previous Page | Next Page | Top of Page