Using the SQL Set Operators Transformation

Problem

You want to combine the results of table-based queries.

Solution

You can use the SQL Set Operators transformation in a SAS Data Integration Studio job. This transformation generates a PROC SQL statement that combines the results of two or more queries in various ways by using the following set operators:
  • UNION: Produces all unique rows from both queries
  • EXCEPT: Produces rows that are part of the first query only
  • INTERSECT: Produces rows that are common to both query results
  • OUTER UNION: Concatenates the query results
The operator is used between the two queries, as shown in the following example:
select columns from table
set-operator
select columns from table;
The semicolon is placed after the last SELECT statement only. Set operators combine columns from two queries based on their position in the reference tables without regard to the individual column names. Columns in the same relative position in the two queries must have the same data types. The column names in the first query become the column names of the output table. Therefore, only its columns are propagated to the output table.
Perform the following tasks:

Tasks

Create and Populate the Job

Perform the following steps to create and populate the job:
  1. Create an empty job.
  2. Select and drag an SQL Set Operators transformation from the Data folder in the Transformations tree. Then, drop it in the empty job on the Diagram tab in the Job Editor window.
  3. Open the properties window of the SQL Set Operators transformation.
  4. Click Set Operators.
  5. Click Add to access the Table Query Selector and select the first table. For example, you could select a table named CONTINENTS_AMERICAS.
  6. Click the Propagate columns button on the toolbar on the Table for the newly added table. This action propagates the columns from the first table query to the output table.
  7. Click Add as often as necessary to select the remaining tables. This sample job also contains tables named CONTINENTS and CONTINENTS_NONAMERICAS.
    The following display shows the tables selected as inputs to the SQL Set Operators transformation:
    SQL Set Operators Tables
    SQL Set Operators Tables
    Note that the table queries are joined with union set operators by default.
The following display shows the resulting SQL set operators process flow in the sample job:
SQL Set Operators Process Flow
SQL Set Operators Process Flow

Configure the Queries

Perform the following steps to configure the table queries:
  1. Click the set operator that you need to configure, such as the Union operator beneath the CONTINENTS_AMERICAS table in the sample job.
  2. Select an operator type in the Set operator type field (such as Intersect).
    The following display shows the set operators section for the table in the sample job:
    Set Operators Section
    Set Operators Section
    Note that you can appropriate options for each operator type. Repeat this process for all of the operators that you need to configure.
  3. Review the SELECT statement for each query.
    The following display shows the SELECT expression for CONTINENTS_AMERICAS table in the sample job:
    SELECT Statement for a Table
    SELECT Statement for a Table
  4. Configure the WHERE, HAVING, ORDER BY, and GROUP BY statements for your table queries as needed. Note that the ORDER BY statement is permitted on the last query only. You can have only one ORDER BY statement in each SQL Set Operators transformation.
    The following display shows the WHERE statement for the CONTINENTS_AMERICAS table.
    WHERE Statement for a Table
    WHERE Statement for a Table
  5. Click Options to review the options for the SQL Set Operators transformation.
    The following display shows debugging options. These options are located in the General section of the Options tab in the SQL Set Operators transformation in the sample job:
    General Options Tab
    General Options Tab
    The following display shows pass-through options. These options are located in the Database pass-through section of the Options tab in the SQL Set Operators transformation in the sample job:
    Database Pass-through Options Tab
    Database pass-through Options tab

Run the Job and Review the Results

Perform the following steps to run the job and view the output:
  1. Right-click on an empty area of the job, and click Run in the pop-up menu. SAS Data Integration Studio generates code for the job and submits it to the SAS Application Server for execution.
  2. If error messages are displayed on the Status tab, read and respond to the messages as needed.
  3. To view the output, right-click the output table and select Open. The output of the sample job is found in a temporary output table. You could also store the output in a permanent target table.
    The following display shows the output of a set operators job:
    Output from a Set Operators Job
    Output from a Set Operators Job
    Note that the names of the row in the output include do not include the text AMERICAS. This text is present in some of the source tables.