GROUP BY Clause
Specifies how to group the data for summarizing.
Syntax
GROUP BY group-by-item <,
…, group-by-item>
Required Argument
group-by-item
is one of the following:
integer
is a positive integer
that equates to a column's position.
column-name
is
the name of a column or a column alias. See column-name.
Details
-
You can specify more than one
group-by-item to
get more detailed reports. Both the grouping of multiple items and
the BY statement of a PROC step are evaluated in similar ways. If
more than one
group-by-item is
specified, then the first one determines the major grouping.
-
Integers can be substituted for
column names (that is, SELECT object-items) in the GROUP BY clause.
For example, if the
group-by-item is
2, then the results are grouped by the values in the second column
of the SELECT clause list. Using integers can shorten your coding
and enable you to group by the value of an unnamed expression in the
SELECT list. Note that if you use a floating-point value (for example,
2.3), then PROC SQL ignores the decimal portion.
-
The data does not have to be sorted
in the order of the group-by values because PROC SQL handles sorting
automatically. You can use the ORDER BY clause to specify the order
in which rows are displayed in the result table.
-
If you specify a GROUP BY clause
in a query that does not contain a summary function, then your clause
is transformed into an ORDER BY clause and a message to that effect
is written to the SAS log.
-
You can group the output by the
values that are returned by an expression. For example, if X is a
numeric variable, then the output of the following is grouped by the
integer portion of values of X:
select x, sum(y)
from table1
group by int(x);
Similarly, if Y is a character variable,
then the output of the following is grouped by the second character
of values of Y:
select sum(x), y
from table1
group by substring(y from 2 for 1);
Note that an expression
that contains only numeric literals (and functions of numeric literals)
or only character literals (and functions of character literals) is
ignored.
An expression in a
GROUP BY clause cannot be a summary function. For example, the following
GROUP BY clause is not valid:
group by sum(x)