Filtering Grouped Data

Overview of Filtering Grouped Data

You can use a HAVING clause with a GROUP BY clause to filter grouped data. The HAVING clause affects groups in a way that is similar to the way in which a WHERE clause affects individual rows. When you use a HAVING clause, PROC SQL displays only the groups that satisfy the HAVING expression.

Using a Simple HAVING Clause

The following example groups the features in the SQL.FEATURES table by type and then displays only the numbers of islands, oceans, and seas:
libname sql 'SAS-library';

proc sql;
   title 'Numbers of Islands, Oceans, and Seas';
   select Type, count(*) as Number
      from sql.features
      group by Type
      having Type in ('Island', 'Ocean', 'Sea')
      order by Type;
Using a Simple HAVING Clause
Numbers of Islands, Oceans, and Seas

Choosing between HAVING and WHERE

The differences between the HAVING clause and the WHERE clause are shown in the following table. Because you use the HAVING clause when you work with groups of data, queries that contain a HAVING clause usually also contain the following:
  • a GROUP BY clause
  • an aggregate function
Note: If you use a HAVING clause without a GROUP BY clause and if the query references at least one aggregate function, PROC SQL treats the input data as if it all comes from a single group of data.
Differences between the HAVING Clause and WHERE Clause
HAVING clause attributes
WHERE clause attributes
is typically used to specify conditions for including or excluding groups of rows from a table.
is used to specify conditions for including or excluding individual rows from a table.
must follow the GROUP BY clause in a query, if used with a GROUP BY clause.
must precede the GROUP BY clause in a query, if used with a GROUP BY clause.
is affected by a GROUP BY clause, when there is no GROUP BY clause, the HAVING clause is treated like a WHERE clause.
is not affected by a GROUP BY clause.
is processed after the GROUP BY clause and any aggregate functions.
is processed before a GROUP BY clause, if there is one, and before any aggregate functions.

Using HAVING with Aggregate Functions

The following query returns the populations of all continents that have more than 15 countries:
libname sql 'SAS-library';

proc sql;
   title 'Total Populations of Continents with More than 15 Countries';
   select Continent, 
          sum(Population) as TotalPopulation format=comma16.,
          count(*) as Count
      from sql.countries
      group by Continent
      having count(*) gt 15
      order by Continent;
The HAVING expression contains the COUNT function, which counts the number of rows within each group.
Using HAVING with the COUNT Function
Total Populations of Continents with More than 15 Countries