sql-expression

Produces a value from a sequence of operands and operators.

Syntax

operand operator operand

Required Arguments

operand

is one of the following:

  • a constant, which is a number or a quoted character string (or other special notation) that indicates a fixed value. Constants are also called literals. Constants are described in SAS Functions and CALL Routines: Reference.
  • a column-name, which is described in column-name.
  • a CASE expression, which is described in CASE Expression.
  • any supported SAS function. PROC SQL supports many of the functions available to the SAS DATA step. Some of the functions that are not supported are the variable information functions, functions that work with arrays of data, and functions that operate on rows other than the current row. Other SQL databases support their own sets of functions. Functions are described in the SAS Functions and CALL Routines: Reference.
  • any functions, except those with array elements, that are created with PROC FCMP.
  • the ANSI SQL functions COALESCE, BTRIM, LOWER, UPPER, and SUBSTRING.
  • a summary-function, which is described in summary-function.
  • a query expression, which is described in query-expression.
  • the USER literal, which references the user ID of the person who submitted the program. The user ID that is returned is operating environment-dependent, but PROC SQL uses the same value that the &SYSJOBID macro variable has on the operating environment.

operator

is described in Operators and the Order of Evaluation.

Note: SAS functions, including summary functions, can stand alone as SQL expressions. For example
select min(x) from table;

select scan(y,4) from table;

Details

SAS Functions

PROC SQL supports many of the functions available to the SAS DATA step. Some of the functions that are not supported are the variable information functions and functions that work with arrays of data. Other SQL databases support their own sets of functions. For example, the SCAN function is used in the following query:
    select style, scan(street,1) format=$15.
       from houses;
PROC SQL also supports any user-written functions, except those functions with array elements, that are created using Chapter 19, “FCMP Procedure” in Base SAS Procedures Guide.
See the SAS Functions and CALL Routines: Reference for complete documentation of SAS functions. Summary functions are also SAS functions. For more information, see summary-function.

USER Literal

USER can be specified in a view definition. For example, you can create a view that restricts access to the views in the user's department. Note that the USER literal value is stored in uppercase, so it is advisable to use the UPCASE function when comparing to this value:
   create view myemp as
      select * from dept12.employees
         where upcase(manager)=user;
This view produces a different set of employee information for each manager who references it.

Operators and the Order of Evaluation

The order in which operations are evaluated is the same as in the DATA step with this one exception: NOT is grouped with the logical operators AND and OR in PROC SQL; in the DATA step, NOT is grouped with the unary plus and minus signs.
Unlike missing values in some versions of SQL, missing values in SAS always appear first in the collating sequence. Therefore, in Boolean and comparison operations, the following expressions resolve to true in a predicate:
   3>null
  -3>null
   0>null
You can use parentheses to group values or to nest mathematical expressions. Parentheses make expressions easier to read and can also be used to change the order of evaluation of the operators. Evaluating expressions with parentheses begins at the deepest level of parentheses and moves outward. For example, SAS evaluates A+B*C as A+(B*C), although you can add parentheses to make it evaluate as (A+B)*C for a different result.
Higher priority operations are performed first: that is, group 0 operators are evaluated before group 5 operators. The following table shows the operators and their order of evaluation, including their priority groups.
Operators and Order of Evaluation
Group
Operator
Description
0
( )
forces the expression enclosed to be evaluated first
1
case-expression
selects result values that satisfy specified conditions
2
**
raises to a power
unary +, unary -
indicates a positive or negative number
3
*
multiplies
/
divides
4
+
adds
subtracts
5
||
concatenates
6
<NOT> BETWEEN condition
<NOT> CONTAINS condition
<NOT> EXISTS condition
<NOT> IN condition
IS <NOT> condition
<NOT> LIKE condition
7
=, eq
equals
¬=, ^=, < >, ne
does not equal
>, gt
is greater than
<, lt
is less than
>=, ge
is greater than or equal to
<=, le
is less than or equal to
=*
sounds like (use with character operands only). See Retrieving Values with the SOUNDS-LIKE Operator.
eqt
equal to truncated strings (use with character operands only). See Truncated String Comparison Operators.
gtt
greater than truncated strings
ltt
less than truncated strings
get
greater than or equal to truncated strings
let
less than or equal to truncated strings
net
not equal to truncated strings
8
¬, ^, NOT
indicates logical NOT
9
&, AND
indicates logical AND
10
|, OR
indicates logical OR
Symbols for operators might vary, depending on your operating environment. For more information, see “SAS Operators in Expressions” in Chapter 6 of SAS Language Reference: Concepts.

Truncated String Comparison Operators

PROC SQL supports truncated string comparison operators. (See Group 7 in Operators and Order of Evaluation.) In a truncated string comparison, the comparison is performed after making the strings the same length by truncating the longer string to be the same length as the shorter string. For example, the expression 'TWOSTORY' eqt 'TWO' is true because the string 'TWOSTORY' is reduced to 'TWO' before the comparison is performed. Note that the truncation is performed internally; neither operand is permanently changed.
Note: Unlike the DATA step, PROC SQL does not support the colon operators (such as =:, >:, and <=:) for truncated string comparisons. Use the alphabetic operators (such as EQT, GTT, and LET).

Query Expressions (Subqueries)

A query expression is called a subquery when it is used in a WHERE or HAVING clause. A subquery is a query expression that is nested as part of another query expression. A subquery selects one or more rows from a table based on values in another table.
Depending on the clause that contains it, a subquery can return a single value or multiple values. If more than one subquery is used in a query expression, then the innermost query is evaluated first, then the next innermost query, and so on, moving outward.
PROC SQL allows a subquery (contained in parentheses) at any point in an expression where a simple column value or constant can be used. In this case, a subquery must return a single value, that is, one row with only one column.
The following is an example of a subquery that returns one value. This PROC SQL step subsets the PROCLIB.PAYROLL table based on information in the PROCLIB.STAFF table. (PROCLIB.PAYROLL is shown in Creating a Table from a Query's Result, and PROCLIB.STAFF is shown in Joining Two Tables.) PROCLIB.PAYROLL contains employee identification numbers (IdNumber) and their salaries (Salary) but does not contain their names. If you want to return only the row from PROCLIB.PAYROLL for one employee, then you can use a subquery that queries the PROCLIB.STAFF table, which contains the employees' identification numbers and their names (Lname and Fname).
proc sql;
    title 'Information for Earl Bowden';
    select *
       from proclib.payroll
       where idnumber=
             (select idnum
             from proclib.staff
             where upcase(lname)='BOWDEN');
Query Output – One Value
Information for Earl Bowden
Subqueries can return multiple values. The following example uses the tables PROCLIB.DELAY and PROCLIB.MARCH. These tables contain information about the same flights and have the Flight column in common. The following subquery returns all the values for Flight in PROCLIB.DELAY for international flights. The values from the subquery complete the WHERE clause in the outer query. Thus, when the outer query is executed, only the international flights from PROCLIB.MARCH are in the output.
proc sql outobs=5;
   title 'International Flights from';
   title2 'PROCLIB.MARCH';
   select Flight, Date, Dest, Boarded
      from proclib.march
      where flight in
            (select flight
             from proclib.delay
             where destype='International');
Query Output – Multiple Values
International Flights from PROCLIB.MARCH
Sometimes it is helpful to compare a value with a set of values returned by a subquery. The keywords ANY or ALL can be specified before a subquery when the subquery is the right-hand operand of a comparison. If ALL is specified, then the comparison is true only if it is true for all values that are returned by the subquery. If a subquery returns no rows, then the result of an ALL comparison is true for each row of the outer query.
If ANY is specified, then the comparison is true if it is true for any one of the values that are returned by the subquery. If a subquery returns no rows, then the result of an ANY comparison is false for each row of the outer query.
The following example selects all of the employees in PROCLIB.PAYROLL who earn more than the highest paid ME3:
proc sql;
title "Employees who Earn More than";
title2 "All ME's";
   select *
     from proclib.payroll
     where salary > all (select salary
                         from proclib.payroll
                         where jobcode='ME3');
Query Output Using ALL Comparison
Employees who Earn More than All ME’s
Note: See the first item in Subqueries and Efficiency for a note about efficiency when using ALL.
In order to visually separate a subquery from the rest of the query, you can enclose the subquery in any number of pairs of parentheses.

Correlated Subqueries

In a correlated subquery, the WHERE expression in a subquery refers to values in a table in the outer query. The correlated subquery is evaluated for each row in the outer query. With correlated subqueries, PROC SQL executes the subquery and the outer query together.
The following example uses the PROCLIB.DELAY and PROCLIB.MARCH tables. A DATA step ( PROCLIB.DELAY) creates PROCLIB.DELAY. PROCLIB.MARCH is shown in Producing All the Possible Combinations of the Values in a Column. PROCLIB.DELAY has the Flight, Date, Orig, and Dest columns in common with PROCLIB.MARCH:
proc sql outobs=5;
   title 'International Flights';
   select *
      from proclib.march
      where 'International' in
           (select destype
            from proclib.delay
            where march.Flight=delay.Flight);
The subquery resolves by substituting every value for MARCH.Flight into the subquery's WHERE clause, one row at a time. For example, when MARCH.Flight=219, the subquery resolves as follows:
  1. PROC SQL retrieves all the rows from DELAY where Flight=219 and passes their DESTYPE values to the WHERE clause.
  2. PROC SQL uses the DESTYPE values to complete the WHERE clause:
    where 'International' in
       ('International','International', ...)
  3. The WHERE clause checks to determine whether International is in the list. Because it is, all rows from MARCH that have a value of 219 for Flight become part of the output.
The following output contains the rows from MARCH for international flights only.
Correlated Subquery Output
International Flights

Subqueries and Efficiency

  • Use the MAX function in a subquery instead of the ALL keyword before the subquery. For example, the following queries produce the same result, but the second query is more efficient:
     proc sql;
        select * from proclib.payroll
        where salary> all(select salary
                          from proclib.payroll
                          where jobcode='ME3');
    
     proc sql;
        select * from proclib.payroll
        where salary> (select max(salary)
                       from proclib.payroll
                       where jobcode='ME3');
  • With subqueries, use IN instead of EXISTS when possible. For example, the following queries produce the same result, but the second query is usually more efficient:
    proc sql;
       select *
          from proclib.payroll p
          where exists (select *
                        from staff s
                        where p.idnum=s.idnum
                           and state='CT');
    
    proc sql;
       select *
          from proclib.payroll
          where idnum in (select idnum
                             from staff
                             where state='CT');