query-expression

Retrieves data from tables.

See: table-expression
Query Expressions (Subqueries)
In-Line Views

Syntax

table-expression <set-operator table-expression> <…set-operator table-expression>

Required Arguments

table-expression

is described in table-expression.

set-operator

is one of the following:

INTERSECT <CORRESPONDING> <ALL>
OUTER UNION <CORRESPONDING>
UNION <CORRESPONDING> <ALL>
EXCEPT <CORRESPONDING> <ALL>

Details

Query Expressions and Table Expressions

A query expression is one or more table expressions. Multiple table expressions are linked by set operators. The following figure illustrates the relationship between table expressions and query expressions.
Relationship between table expressions and query expressions

Set Operators

PROC SQL provides these set operators:
OUTER UNION
concatenates the query results.
UNION
produces all unique rows from both queries.
EXCEPT
produces rows that are part of the first query only.
INTERSECT
produces rows that are common to both query results.
A query expression with set operators is evaluated as follows.
  • Each table expression is evaluated to produce an (internal) intermediate result table.
  • Each intermediate result table then becomes an operand linked with a set operator to form an expression. For example, A UNION B.
  • If the query expression involves more than two table expressions, then the result from the first two becomes an operand for the next set operator and operand, such as (A UNION B) EXCEPT C, ((A UNION B) EXCEPT C) INTERSECT D, and so on.
  • Evaluating a query expression produces a single output table.
Set operators follow this order of precedence unless they are overridden by parentheses in the expressions: INTERSECT is evaluated first. OUTER UNION, UNION, and EXCEPT have the same level of precedence.
PROC SQL performs set operations even if the tables or views that are referred to in the table expressions do not have the same number of columns. The reason for this behavior is that the ANSI standard for SQL requires that tables or views that are involved in a set operation have the same number of columns and that the columns have matching data types. If a set operation is performed on a table or view that has fewer columns than the one or ones with which it is being linked, then PROC SQL extends the table or view with fewer columns by creating columns with missing values of the appropriate data type. This temporary alteration enables the set operation to be performed correctly.

CORRESPONDING (CORR) Keyword

The CORRESPONDING keyword is used only when a set operator is specified. CORR causes PROC SQL to match the columns in table expressions by name and not by ordinal position. Columns that do not match by name are excluded from the result table, except for the OUTER UNION operator. See OUTER UNION.
For example, when performing a set operation on two table expressions, PROC SQL matches the first specified column-name (listed in the SELECT clause) from one table expression with the first specified column-name from the other. If CORR is omitted, then PROC SQL matches the columns by ordinal position.

ALL Keyword

The set operators automatically eliminate duplicate rows from their output tables. The optional ALL keyword preserves the duplicate rows, reduces the execution by one step, and thereby improves the query expression's performance. You use it when you want to display all the rows resulting from the table expressions, rather than just the unique rows. The ALL keyword is used only when a set operator is also specified.

OUTER UNION

Performing an OUTER UNION is very similar to performing the SAS DATA step with a SET statement. The OUTER UNION concatenates the intermediate results from the table expressions. Thus, the result table for the query expression contains all the rows produced by the first table expression followed by all the rows produced by the second table expression. Columns with the same name are in separate columns in the result table.
For example, the following query expression concatenates the ME1 and ME2 tables but does not overlay like-named columns. Outer Union of ME1 and ME2 Tables shows the result.
data me1;
   input IDnum $ Jobcode $ Salary Bonus;
   datalines;
1400   ME1         29769  587
1403   ME1         28072  342
1120   ME1         28619  986
1120   ME1         28619  986
;

data me2;
   input IDnum $ Jobcode $ Salary;
   datalines;
1653   ME2         35108
1782   ME2         35345
1244   ME2         36925
;
proc sql ;
   title 'ME1';
   select * from me1;
   title 'ME2';
   select * from me2;
ME1 and ME2 Tables
ME1 and ME2 Tables
proc sql;
   title 'ME1 and ME2: OUTER UNION';
   select *
      from me1
   outer union
   select *
      from me2;
Outer Union of ME1 and ME2 Tables
Outer Union of ME1 and ME2 Tables
Concatenating tables with the OUTER UNION set operator is similar to performing a union join. See Union Joins for more information.
To overlay columns with the same name, use the CORRESPONDING keyword.
proc sql;
   title 'ME1 and ME2: OUTER UNION CORRESPONDING';
   select *
      from me1
   outer union corr
   select *
      from me2;
Outer Union Corresponding
Outer Union Corresponding
In the resulting concatenated table, notice the following:
  • OUTER UNION CORRESPONDING retains all nonmatching columns.
  • For columns with the same name, if a value is missing from the result of the first table expression, then the value in that column from the second table expression is inserted.
  • The ALL keyword is not used with OUTER UNION because this operator's default action is to include all rows in a result table. Thus, both rows from the table ME1 where IDnum is 1120 appear in the output.

UNION

The UNION operator produces a table that contains all the unique rows that result from both table expressions. That is, the output table contains rows produced by the first table expression, the second table expression, or both.
Columns are appended by position in the tables, regardless of the column names. However, the data type of the corresponding columns must match or the union will not occur. PROC SQL issues a warning message and stops executing.
The names of the columns in the output table are the names of the columns from the first table expression unless a column (such as an expression) has no name in the first table expression. In such a case, the name of that column in the output table is the name of the respective column in the second table expression.
In the following example, PROC SQL combines the two tables:
proc sql;
   title 'ME1 and ME2: UNION';
   select *
      from me1
   union
   select *
      from me2;
Union of ME1 and ME2 Tables
Union of ME1 and ME2 Tables
In the following example, ALL includes the duplicate row from ME1. In addition, ALL changes the sorting by specifying that PROC SQL make one pass only. Thus, the values from ME2 are simply appended to the values from ME1.
proc sql;
   title 'ME1 and ME2: UNION ALL';
   select *
      from me1
   union all
   select *
      from me2;
Union All
Union All
See Combining Two Tables for another example.

EXCEPT

The EXCEPT operator produces (from the first table expression) an output table that has unique rows that are not in the second table expression. If the intermediate result from the first table expression has at least one occurrence of a row that is not in the intermediate result of the second table expression, then that row (from the first table expression) is included in the result table.
In the following example, the IN_USA table contains flights to cities within and outside the USA. The OUT_USA table contains flights only to cities outside the USA.
data in_usa;
   input Flight $ Dest $;
   datalines;
145 ORD
156 WAS
188 LAX
193 FRA
207 LON
;
data OUT_USA;
   input Flight $ Dest $;
   datalines;
193 FRA
207 LON
311 SJA
;
proc sql;
   title 'IN_USA';
   select * from in_usa;
   title 'OUT_USA';
   select * from out_usa;
Source Tables for Except Examples
IN_USA and OUT_USA Tables
This example returns only the rows from IN_USA that are not also in OUT_USA:
proc sql;
    title 'Flights from IN_USA Only';
    select * from in_usa
    except
    select * from out_usa;
Flights from IN_USA Only
Flights from IN_USA Only

INTERSECT

The INTERSECT operator produces an output table that has rows that are common to both tables. For example, using the IN_USA and OUT_USA tables shown above, the following example returns rows that are in both tables:
proc sql;
    title 'Flights from Both IN_USA and OUT_USA';
    select * from in_usa
    intersect
    select * from out_usa;
Flights from Both IN_USA and OUT_USA
Flights from Both IN_USA and OUT_USA