Retrieving Data from Multiple Tables |
Working with Two or More Query Results |
PROC SQL can combine the results of two or more queries in various ways by using the following set operators:
UNION |
produces all unique rows from both queries. |
EXCEPT |
produces rows that are part of the first query only. |
INTERSECT |
produces rows that are common to both query results. |
OUTER UNION |
concatenates the query results. |
The operator is used between the two queries, for example:
select columns from table set-operator select columns from table;
Place a semicolon after the last SELECT statement only. Set operators combine columns from two queries based on their position in the referenced tables without regard to the individual column names. Columns in the same relative position in the two queries must have the same data types. The column names of the tables in the first query become the column names of the output table. For information about using set operators with more than two query results, see the section about the SQL procedure in the Base SAS Procedures Guide. The following optional keywords give you more control over set operations:
does not suppress duplicate rows. When the keyword ALL is specified, PROC SQL does not make a second pass through the data to eliminate duplicate rows. Thus, using ALL is more efficient than not using it. ALL is not necessary with the OUTER UNION operator.
overlays columns that have the same name in both tables. When used with EXCEPT, INTERSECT, and UNION, CORR suppresses columns that are not in both tables.
Each set operator is described and used in an example based on the following two tables.
Tables Used in Set Operation Examples
Table A x y ------------------ 1 one 2 two 2 two 3 three
Table B x z ------------------ 1 one 2 two 4 four
Whereas join operations combine tables horizontally, set operations combine tables vertically. Therefore, the set diagrams that are included in each section are displayed vertically.
Producing Unique Rows from Both Queries (UNION) |
The UNION operator combines two query results. It produces all the unique rows that result from both queries; that is, it returns a row if it occurs in the first table, the second, or both. UNION does not return duplicate rows. If a row occurs more than once, then only one occurrence is returned.
proc sql; title 'A UNION B'; select * from sql.a union select * from sql.b;
Producing Unique Rows from Both Queries (UNION)
A UNION B x y ------------------ 1 one 2 two 3 three 4 four
You can use the ALL keyword to request that duplicate rows remain in the output.
proc sql; title 'A UNION ALL B'; select * from sql.a union all select * from sql.b;
Producing Rows from Both Queries (UNION ALL)
A UNION ALL B x y ------------------ 1 one 2 two 2 two 3 three 1 one 2 two 4 four
Producing Rows That Are in Only the First Query Result (EXCEPT) |
The EXCEPT operator returns rows that result from the first query but not from the second query. In this example, the row that contains the values 3 and three exists in the first query (table A) only and is returned by EXCEPT.
proc sql; title 'A EXCEPT B'; select * from sql.a except select * from sql.b;
Producing Rows That Are in Only the First Query Result (EXCEPT)
A EXCEPT B x y ------------------ 3 three
Note that the duplicated row in Table A containing the values 2 and two does not appear in the output. EXCEPT does not return duplicate rows that are unmatched by rows in the second query. Adding ALL keeps any duplicate rows that do not occur in the second query.
proc sql; title 'A EXCEPT ALL B'; select * from sql.a except all select * from sql.b;
Producing Rows That Are in Only the First Query Result (EXCEPT ALL)
A EXCEPT ALL B x y ------------------ 2 two 3 three
Producing Rows That Belong to Both Query Results (INTERSECT) |
The INTERSECT operator returns rows from the first query that also occur in the second.
proc sql; title 'A INTERSECT B'; select * from sql.a intersect select * from sql.b;
Producing Rows That Belong to Both Query Results (INTERSECT)
A INTERSECT B x y ------------------ 1 one 2 two
The output of an INTERSECT ALL operation contains the rows produced by the first query that are matched one-to-one with a row produced by the second query. In this example, the output of INTERSECT ALL is the same as INTERSECT.
Concatenating Query Results (OUTER UNION) |
The OUTER UNION operator concatenates the results of the queries. This example concatenates tables A and B.
proc sql; title 'A OUTER UNION B'; select * from sql.a outer union select * from sql.b;
Concatenating the Query Results (OUTER UNION)
A OUTER UNION B x y x z -------------------------------------- 1 one . 2 two . 2 two . 3 three . . 1 one . 2 two . 4 four
Notice that OUTER UNION does not overlay columns from the two tables. To overlay columns in the same position, use the CORRESPONDING keyword.
proc sql; title 'A OUTER UNION CORR B'; select * from sql.a outer union corr select * from sql.b;
Concatenating the Query Results (OUTER UNION CORR)
A OUTER UNION CORR B x y z ---------------------------- 1 one 2 two 2 two 3 three 1 one 2 two 4 four
Producing Rows from the First Query or the Second Query |
There is no keyword in PROC SQL that returns unique rows from the first and second table, but not rows that occur in both. Here is one way you can simulate this operation:
(query1 except query2) union (query2 except query1)
This example shows how to use this operation.
proc sql; title 'A EXCLUSIVE UNION B'; (select * from sql.a except select * from sql.b) union (select * from sql.b except select * from sql.a);
Producing Rows from the First Query or the Second Query
A EXCLUSIVE UNION B x y ------------------ 3 three 4 four
The first EXCEPT returns one unique row from the first table (table A) only. The second EXCEPT returns one unique row from the second table (table B) only. The middle UNION combines the two results. Thus, this query returns the row from the first table that is not in the second table, as well as the row from the second table that is not in the first table.
Copyright © 2008 by SAS Institute Inc., Cary, NC, USA. All rights reserved.