DS2 BY-group processing
groups the rows from input tables and orders the rows by values of
one or more columns in the BY statement.
With in-database processing,
data is distributed on different data partitions. Each DS2 thread
running inside the database has access to one data partition. Each
DS2 thread can group and order only the rows in the same data partition.
Consequently, a data partition might contain only part of a BY
group, and you must do a final aggregation in the main data program.
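
The effect of partitioned data on BY-group results can be illustrated
with a small simulation. The sketch below is plain Python, not DS2, and
it is only a conceptual model: the rows, the column names (region,
amount), the two-way split, and the helper thread_partial_sums are all
hypothetical. It shows each simulated thread summing only its own
partition, followed by the final aggregation that combines the partial
results.

```python
from collections import defaultdict

# Hypothetical input rows: (region, amount). Not taken from any real table.
rows = [("east", 10), ("east", 7), ("west", 5), ("west", 3), ("east", 1)]


def thread_partial_sums(partition):
    """One simulated thread: group and sum only the rows in its own partition."""
    sums = defaultdict(int)
    for region, amount in sorted(partition):  # order rows by the BY column
        sums[region] += amount
    return dict(sums)


# Assumed distribution: rows are dealt out alternately, so a single
# group can end up split across both partitions.
partitions = [rows[0::2], rows[1::2]]
partials = [thread_partial_sums(p) for p in partitions]

# Final aggregation, standing in for the main data program:
# combine the per-partition subtotals into one total per group.
final = defaultdict(int)
for partial in partials:
    for region, subtotal in partial.items():
        final[region] += subtotal
final = dict(final)

print(partials)
print(final)
```

Neither simulated thread sees every "east" row, so neither subtotal is
correct on its own; only the combined result is.
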
In some instances, however,
each thread must process the entire group of data.
The SAS In-Database Code Accelerator provides a way to redistribute
the input table to the thread program with a BY statement so that
the entire group of data resides on the same data partition.
The BYPARTITION argument in the PROC DS2 statement
controls whether the input data is re-partitioned.
By default, the input data for the DS2 program is automatically re-partitioned
by the first BY variable. All rows of a given BY group are then in the same
data partition and are processed by the same thread. Each thread does the BY
processing for the entire group of data. You might not need to do
the final aggregation in the main data program.
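
Re-partitioning by the first BY variable can be modeled the same way.
The sketch below is again plain Python rather than DS2, and it does not
reflect how the SAS In-Database Code Accelerator or the database
actually assigns rows to partitions. The CRC-based assignment, the
partition count, and the helper names are assumptions made for
illustration. The point is only that once every row with the same BY
value lands in one partition, each thread produces complete group
totals.

```python
import zlib
from collections import defaultdict

# Hypothetical input rows: (region, amount).
rows = [("east", 10), ("east", 7), ("west", 5), ("west", 3), ("east", 1)]
NUM_PARTITIONS = 2  # assumed partition count


def repartition(rows, num_partitions):
    """Send every row with the same BY value to the same partition."""
    partitions = [[] for _ in range(num_partitions)]
    for region, amount in rows:
        # Deterministic stand-in for whatever distribution the database uses.
        idx = zlib.crc32(region.encode("utf-8")) % num_partitions
        partitions[idx].append((region, amount))
    return partitions


def thread_group_sums(partition):
    """One simulated thread: sum each group found in its partition."""
    sums = defaultdict(int)
    for region, amount in sorted(partition):  # order rows by the BY column
        sums[region] += amount
    return dict(sums)


partitions = repartition(rows, NUM_PARTITIONS)
results = [thread_group_sums(p) for p in partitions]

# No group appears in more than one thread's output, so the outputs
# can simply be collected; no subtotals need to be added together.
combined = {}
for result in results:
    assert not (combined.keys() & result.keys())
    combined.update(result)

print(combined)
```

Because the per-thread results never overlap, the main program in this
model only collects rows and does no further arithmetic.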