Statements with the Same Function in Multiple Procedures

BY


Orders the output according to the BY groups.
See also: Creating Titles That Contain BY-Group Information

BY <DESCENDING> variable-1
<... <DESCENDING> variable-n>
<NOTSORTED>;

Required Arguments

variable

specifies the variable that the procedure uses to form BY groups. You can specify more than one variable. If you do not use the NOTSORTED option in the BY statement, then either the observations in the data set must be sorted by all the variables that you specify, or they must be indexed appropriately. Variables in a BY statement are called BY variables.


Options

DESCENDING

specifies that the observations are sorted in descending order by the variable that immediately follows the word DESCENDING in the BY statement.
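As a minimal sketch of DESCENDING, the following step sorts a small data set in descending order of one variable and then repeats DESCENDING in the BY statement of the analysis step so that it matches the sort order. The data set WORK.SALES and its variables Region and Amount are hypothetical and exist only for this illustration.

```sas
/* Hypothetical data set: one row per sale. */
data work.sales;
   input Region $ Amount;
   datalines;
East 100
West 250
East 75
North 40
West 10
;
run;

/* Sort so that Region is in descending order. */
proc sort data=work.sales;
   by descending Region;
run;

/* Repeat DESCENDING so that the BY statement matches the sort order. */
proc means data=work.sales sum;
   by descending Region;
   var Amount;
run;
```

With more than one BY variable, DESCENDING would need to precede each variable that is sorted in descending order, because it applies only to the variable that immediately follows it.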

NOTSORTED

specifies that observations are not necessarily sorted in alphabetic or numeric order. The observations are grouped in another way, for example, in chronological order.

The requirement for ordering or indexing observations according to the values of BY variables is suspended for BY-group processing when you use the NOTSORTED option. In fact, the procedure does not use an index if you specify NOTSORTED. The procedure defines a BY group as a set of contiguous observations that have the same values for all BY variables. If observations with the same values for the BY variables are not contiguous, then the procedure treats each contiguous set as a separate BY group.
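The contiguity rule can be illustrated with a small sketch. The data set WORK.READINGS and its variables Month and Value are hypothetical. The rows are in calendar order rather than alphabetic order, and one Jan row appears again at the end.

```sas
/* Hypothetical data set in chronological order, not alphabetic order. */
data work.readings;
   input Month $ Value;
   datalines;
Jan 10
Jan 12
Feb 9
Feb 11
Mar 14
Jan 20
;
run;

/* NOTSORTED: each run of contiguous rows with the same Month is one BY group. */
proc print data=work.readings noobs;
   by Month notsorted;
run;
```

Going by the rule described above, the final Jan row is not contiguous with the first two Jan rows, so it should form a BY group of its own rather than being merged with them.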

Note: You cannot use the NOTSORTED option in a PROC SORT step.

Note: You cannot use the GROUPFORMAT option, which is available in the BY statement in a DATA step, in a BY statement in any PROC step.


BY-Group Processing

Procedures create output for each BY group. For example, the elementary statistics procedures and the scoring procedures perform separate analyses for each BY group. The reporting procedures produce a report for each BY group.

Note: All Base SAS procedures except PROC PRINT process BY groups independently. PROC PRINT can report the number of observations in each BY group as well as the number of observations in all BY groups. Similarly, PROC PRINT can sum numeric variables in each BY group and across all BY groups.
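A short sketch of the PROC PRINT behavior described in the note, using the N option and the SUM statement together with a BY statement. The data set WORK.ORDERS and its variables Dept and Cost are hypothetical, and the rows are entered already in Dept order so that no sort is needed.

```sas
/* Hypothetical data set, already in ascending order of Dept. */
data work.orders;
   input Dept $ Cost;
   datalines;
A 10
A 20
B 5
B 15
B 25
;
run;

/* N requests observation counts; SUM requests totals for Cost. */
proc print data=work.orders n;
   by Dept;
   sum Cost;
run;
```

The exact placement of the counts and totals in the listing depends on the combination of options used, so treat this as a starting point and check the PROC PRINT documentation for the N option and the SUM statement.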

You can use only one BY statement in each PROC step. When you use a BY statement, the procedure expects an input data set that is sorted by the order of the BY variables or one that has an appropriate index. If your input data set does not meet these criteria, then an error occurs. To avoid the error, either sort the data set with the SORT procedure or create an appropriate index on the BY variables.
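The two ways of preparing a data set for BY-group processing can be sketched as follows. The data set WORK.SCORES and its variables Team and Points are hypothetical. The first approach sorts a copy of the data with PROC SORT. The second leaves the rows where they are and creates a simple index on the BY variable with PROC DATASETS.

```sas
/* Hypothetical unsorted data set. */
data work.scores;
   input Team $ Points;
   datalines;
Red 3
Blue 7
Red 5
Green 2
Blue 1
;
run;

/* Approach 1: sort a copy of the data by the BY variable. */
proc sort data=work.scores out=work.scores_sorted;
   by Team;
run;

proc means data=work.scores_sorted mean;
   by Team;
   var Points;
run;

/* Approach 2: keep the original row order and create a simple index on Team. */
proc datasets library=work nolist;
   modify scores;
   index create Team;
quit;

proc means data=work.scores mean;
   by Team;
   var Points;
run;
```

Which approach is preferable is a design choice: sorting rewrites the data once, while an index keeps the original order but must be maintained as the data set changes.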

Depending on the order of your data, you might need to use the NOTSORTED or DESCENDING option in the BY statement in the PROC step.


Formatting BY-Variable Values

When a procedure is submitted with a BY statement, the following actions are taken with respect to processing of BY groups:

  1. The procedure determines whether the data is sorted by the internal (unformatted) values of the BY variable(s).

  2. The procedure determines whether a format has been applied to the BY variable(s). If the BY variable is numeric and has no user-applied format, then the BEST12. format is applied for the purpose of BY-group processing.

  3. The procedure continues adding observations to the current BY group until both the internal and the formatted values of the BY variable or variables change.

This process can have unexpected results if, for example, nonconsecutive internal BY values share the same formatted value. In this case, the formatted value is represented in different BY groups. Alternatively, if different consecutive internal BY values share the same formatted value, then these observations are grouped into the same BY group.
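The effect of a format on BY groups can be explored with a small sketch. The format AGEGRP., the data set WORK.PEOPLE, and the variables Name and Age are all hypothetical. The rows are in ascending order of the internal Age values, and the format maps several consecutive internal values to the same label.

```sas
/* Hypothetical format that maps ranges of ages to two labels. */
proc format;
   value agegrp low-<18 = 'Minor'
                18-high = 'Adult';
run;

/* Hypothetical data set, already in ascending order of Age. */
data work.people;
   input Name $ Age;
   datalines;
Ann 12
Bob 15
Cid 17
Dee 21
Eve 34
;
run;

/* The BY variable carries the AGEGRP. format for this step. */
proc print data=work.people noobs;
   by Age;
   format Age agegrp.;
run;
```

Going by the description above, the consecutive ages 12, 15, and 17 share the formatted value Minor and would be expected to fall into one BY group, with 21 and 34 in another. Removing the FORMAT statement is an easy way to compare this with grouping on the internal values.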


Base SAS Procedures That Support the BY Statement

CALENDAR    REPORT (nonwindowing environment only)
CHART       SORT (required)
COMPARE     STANDARD
CORR        SUMMARY
FREQ        TABULATE
MEANS       TIMEPLOT
PLOT        TRANSPOSE
PRINT       UNIVARIATE
RANK

Note: In the SORT procedure, the BY statement specifies how to sort the data. In the other procedures, the BY statement specifies how the data is currently sorted.


Example

This example uses a BY statement in a PROC PRINT step. There is output for each value of the BY variable Year. The DEBATE data set is created in Example: Temporarily Dissociating a Format from a Variable.

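/* Set page layout options for the listing output. */
options nodate pageno=1 linesize=64
        pagesize=40;

/* Print DEBATE with a separate section for each value of Year. */
proc print data=debate noobs;
   by year;
   title 'Printing of Team Members';
   title2 'by Year';
run;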

                    Printing of Team Members                   1
                            by Year

------------------------ Year=Freshman -------------------------

                    Name      Gender     GPA

                  Capiccio      m       3.598
                  Tucker        m       3.901


------------------------ Year=Sophomore ------------------------

                    Name      Gender     GPA

                   Bagwell      f       3.722
                   Berry        m       3.198
                   Metcalf      m       3.342


------------------------- Year=Junior --------------------------

                    Name    Gender     GPA

                    Gold      f       3.609
                    Gray      f       3.177
                    Syme      f       3.883


------------------------- Year=Senior --------------------------

                  Name        Gender     GPA

                  Baglione      f       4.000
                  Carr          m       3.750
                  Hall          m       3.574
                  Lewis         m       3.421
