Joining a Table to Itself

Problem

You need to produce a subset of information that is based on the relationship between columns in the same table.

Solution

You can join the table to itself by creating the second version of the table with an alias. Then, you can create a query to compare data from columns in the original table to other columns in the aliased table.

Tasks

Join the Table to Itself

Perform the following steps to join a table to itself and use the resulting hierarchy of tables in a query:
  1. Create an SQL query in an empty job. The query should contain the SQL Join transformation, at least one source table, and a target table.
  2. Open the Designer window for the SQL Join transformation. Click Create in the Navigate pane to access the Diagram tab and the SQL Clauses pane.
  3. Drop the same table that was used as a source table for the query in the Diagram tab. You are prompted to supply an alias for the table because it is already being used as a source table for the query. Enter the alias in the Alias field of the properties pane for the table. The dialog box for the alias is shown in the following display.
    Self-Join Alias Dialog Box
    Self-Join Alias Dialog Box
  4. Complete any additional configuration needed to finish the query. The following display shows a sample job that includes a table joined to itself.
    Sample Job with a Table Joined to Itself
    Sample Job with a Table Joined to Itself
The tables in the flow shown on the Diagram tab are reflected in the FROM clause that is highlighted on the Code tab below it. The query that is shown in the sample job pulls the Name variable from the original table (denoted with the us alias). However, it pulls the Population and Area variables from the copy of the original table (denoted with the uscopy alias).