SAS Institute. The Power to Know

Learning Center

Question from the Field

Question: When is a correlated subquery desirable or necessary?

Answer: Many queries can be evaluated by executing the subquery once and substituting the resulting value or values into the WHERE clause of the outer query. In queries that include a correlated subquery (also known as a repeating subquery), the subquery depends on the outer query for its values. This means that the subquery is executed repeatedly, once for each row that might be selected by the outer query.

In the beginning, databases did not support joins; thus, the only method of returning the desired results was the subquery. If the query with a correlated subquery can be re-written as a join, it will be more efficient to do so. If it can't, then the correlated sub query is required to get the results.

For efficiency in subqueries, 1.) use the MAX function in a subquery instead of the ALL keyword before the subquery; and 2.) use the IN operator instead of the EXISTS keyword, when possible.

For example, the following queries produce the same result, but the second query is more efficient:
proc sql;
    select * from work.payroll
    where salary> all(select salary
                      from work.payroll
                      where jobcode='ME3');

 proc sql;
    select * from work.payroll
    where salary> (select max(salary)
                   from work.payroll
                 where jobcode='ME3');


View question archives.

These sample files and code examples are provided by SAS Institute Inc. "as is" without warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability and fitness for a particular purpose. Recipients acknowledge and agree that SAS Institute shall not be liable for any damages whatsoever arising out of their use of this material. In addition, SAS Institute will provide no support for the materials contained herein.