|Statements with the Same Function in Multiple Procedures|
See also: Creating Titles That Contain BY-Group Information
BY <DESCENDING> variable-1
   <... <DESCENDING> variable-n>
   <NOTSORTED>;
variable
specifies the variable that the procedure uses to form BY groups. You can specify more than one variable. If you do not use the NOTSORTED option in the BY statement, then either the observations in the data set must be sorted by all the variables that you specify, or they must be indexed appropriately. Variables in a BY statement are called BY variables.
DESCENDING
specifies that the observations are sorted in descending order by the variable that immediately follows the word DESCENDING in the BY statement.
NOTSORTED
specifies that observations are not necessarily sorted in alphabetic or numeric order. The observations are grouped in another way, for example, in chronological order.
The requirement for ordering or indexing observations according to the values of BY variables is suspended for BY-group processing when you use the NOTSORTED option. In fact, the procedure does not use an index if you specify NOTSORTED. The procedure defines a BY group as a set of contiguous observations that have the same values for all BY variables. If observations with the same values for the BY variables are not contiguous, then the procedure treats each contiguous set as a separate BY group.
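The contiguity rule above can be sketched as follows. This is a minimal illustration, assuming a hypothetical data set named VISITS (not part of this documentation) whose observations are in calendar order rather than alphabetic order by Month.

```sas
/* Hypothetical data: observations are in calendar order, */
/* not in alphabetic order by Month.                      */
data visits;
   input Month $ Count;
   datalines;
Jan 10
Jan 12
Feb 7
Mar 9
Jan 4
;
run;

/* NOTSORTED forms a BY group from each contiguous run of */
/* observations that share the same value of Month.       */
proc print data=visits noobs;
   by Month notsorted;
run;
```

Under these assumptions, the step should form four BY groups (Jan, Feb, Mar, and a second Jan group), because the last Jan observation is not contiguous with the first two. Without NOTSORTED, the same step would be expected to stop with an error, because the data is neither sorted nor indexed by Month.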
Note: You cannot use the NOTSORTED option in a PROC SORT step.
Note: You cannot use the GROUPFORMAT option, which is available in the BY statement in a DATA step, in a BY statement in any PROC step.
Procedures create output for each BY group. For example, the elementary statistics procedures and the scoring procedures perform separate analyses for each BY group. The reporting procedures produce a report for each BY group.
Note: All Base SAS procedures except PROC PRINT process BY groups independently. PROC PRINT can report the number of observations in each BY group as well as the number of observations in all BY groups. Similarly, PROC PRINT can sum numeric variables in each BY group and across all BY groups.
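As a hedged sketch of the PROC PRINT behavior described in this note, the following step combines the N option and a SUM statement with a BY statement. The SALES data set, its variables, and its values are hypothetical and are used only for illustration.

```sas
/* Hypothetical data set, already in ascending order by Region. */
data sales;
   input Region $ Amount;
   datalines;
East 100
East 250
West 75
West 125
;
run;

/* N reports observation counts; SUM totals Amount for each */
/* BY group and for all BY groups together.                 */
proc print data=sales n noobs;
   by Region;
   sum Amount;
run;
```

Because the step uses both a BY statement and a SUM statement, the N option should report the number of observations for each value of Region as well as for the whole data set.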
You can use only one BY statement in each PROC step. When you use a BY statement, the procedure expects an input data set that is sorted by the order of the BY variables or one that has an appropriate index. If your input data set does not meet these criteria, then an error occurs. To avoid the error, either sort the data set with the SORT procedure or create an appropriate index on the BY variables.
Depending on the order of your data, you might need to use the NOTSORTED or DESCENDING option in the BY statement in the PROC step.
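The following sketch shows one way to prepare data for BY-group processing with PROC SORT and then to describe that same order in a later PROC step. The SCORES data set and its variables are hypothetical.

```sas
/* Hypothetical unsorted data. */
data scores;
   input Team $ Points;
   datalines;
Blue 14
Red 22
Blue 9
Red 17
;
run;

/* In PROC SORT, the BY statement specifies how to order the data. */
proc sort data=scores out=scores_sorted;
   by descending Team;
run;

/* In the analysis step, the BY statement must describe the order */
/* that the data is already in, so DESCENDING is repeated here.   */
proc means data=scores_sorted mean;
   by descending Team;
   var Points;
run;
```

If the second BY statement omitted DESCENDING, the procedure would expect ascending order and the step would be expected to fail on this data.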
For more information on the BY statement, see the SAS Language Reference: Dictionary.
For more information on PROC SORT, see The SORT Procedure.
For more information on creating indexes, see INDEX CREATE Statement.
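As an alternative to sorting, an index on the BY variable can satisfy the ordering requirement. The sketch below assumes a hypothetical data set named GAMES in the WORK library and creates a simple index with the INDEX CREATE statement in PROC DATASETS.

```sas
/* Hypothetical data that is not sorted by Team. */
data games;
   input Team $ Points;
   datalines;
Red 22
Blue 14
Red 17
Blue 9
;
run;

/* Create a simple index on the BY variable. */
proc datasets library=work nolist;
   modify games;
   index create Team;
quit;

/* The BY statement can now use the index instead of a sort. */
proc means data=games mean;
   by Team;
   var Points;
run;
```

An index leaves the physical order of the observations unchanged, which can be preferable when the data set is large or is also processed in its original order.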
|Formatting BY-Variable Values|
When a procedure is submitted with a BY statement, the following actions are taken with respect to processing of BY groups:
1. The procedure determines whether the data is sorted by the internal (unformatted) values of the BY variable(s).
2. The procedure determines whether a format has been applied to the BY variable(s). If the BY variable is numeric and has no user-applied format, then the BEST12. format is applied for the purpose of BY-group processing.
3. The procedure continues adding observations to the current BY group until both the internal and the formatted values of the BY variable or variables change.
This process can have unexpected results if, for example, nonconsecutive internal BY values share the same formatted value. In this case, the formatted value is represented in different BY groups. Alternatively, if different consecutive internal BY values share the same formatted value, then these observations are grouped into the same BY group.
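The second case can be sketched with a format that maps several consecutive internal values to one label. The AGEGRP. format and the PEOPLE data set are hypothetical and are not part of this documentation.

```sas
/* Hypothetical format that collapses ranges of ages into two labels. */
proc format;
   value agegrp low-<18 = 'Minor'
                18-high = 'Adult';
run;

/* Hypothetical data, sorted by the internal values of Age. */
data people;
   input Name $ Age;
   datalines;
Ann 12
Bob 15
Cal 30
Dee 45
;
run;

/* The format is applied to the BY variable, so the formatted */
/* value takes part in deciding where each BY group ends.     */
proc print data=people noobs;
   by Age;
   format Age agegrp.;
run;
```

Under these assumptions, the step should produce two BY groups (Minor and Adult) rather than four, because consecutive ages share a formatted value. Conversely, a format that assigned the same label to nonconsecutive internal values would be expected to produce that label in more than one BY group.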
|Base SAS Procedures That Support the BY Statement|
|CALENDAR||REPORT (nonwindowing environment only)|
Note: In the SORT procedure, the BY statement specifies how to sort the data. In the other procedures, the BY statement specifies how the data is currently sorted.
This example uses a BY statement in a PROC PRINT step. There is output for each value of the BY variable Year. The DEBATE data set is created in Example: Temporarily Dissociating a Format from a Variable.
options nodate pageno=1 linesize=64 pagesize=40;

proc print data=debate noobs;
   by year;
   title 'Printing of Team Members';
   title2 'by Year';
run;
                    Printing of Team Members                   1
                            by Year

------------------------ Year=Freshman -------------------------

                  Name        Gender     GPA

                  Capiccio      m       3.598
                  Tucker        m       3.901

------------------------ Year=Sophomore ------------------------

                  Name        Gender     GPA

                  Bagwell       f       3.722
                  Berry         m       3.198
                  Metcalf       m       3.342

------------------------- Year=Junior --------------------------

                  Name        Gender     GPA

                  Gold          f       3.609
                  Gray          f       3.177
                  Syme          f       3.883

------------------------- Year=Senior --------------------------

                  Name        Gender     GPA

                  Baglione      f       4.000
                  Carr          m       3.750
                  Hall          m       3.574
                  Lewis         m       3.421