
Working with Grouped or Sorted Observations

Learning More

Alternative to sorting observations

An alternative to sorting observations is to create an index that identifies the observations with particular values of a variable. For more information, see the "SAS Data Files" section of SAS Language Reference: Concepts.
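As a minimal sketch of the index approach, the following steps create a simple index on one variable and then use a BY statement without sorting first. The data set WORK.SALES, its variables, and its values are hypothetical and serve only as an illustration.

```sas
/* Hypothetical data set, deliberately not in STATE order */
data work.sales;
   input state $ city $ amount;
   datalines;
NC Raleigh 120
CA Fresno 75
NC Durham 60
CA Oakland 90
;
run;

/* Create a simple index on STATE instead of sorting the data set */
proc datasets library=work nolist;
   modify sales;
   index create state;
quit;

/* The BY statement can use the index to read observations in STATE order */
data work.state_totals;
   set work.sales;
   by state;
   if first.state then total = 0;   /* reset at the start of each BY group */
   total + amount;                  /* sum statement retains TOTAL */
   if last.state then output;       /* one observation per STATE */
   keep state total;
run;
```

Reading through an index is often slower than reading a sorted data set sequentially, so whether an index or a sort works better depends on the size of the data and how often it is processed.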

BY statement and BY-group processing

See SAS Language Reference: Dictionary and SAS Language Reference: Concepts.

Interleaving, merging, and updating SAS data sets

See Interleaving SAS Data Sets, Merging SAS Data Sets, and Updating SAS Data Sets. These operations depend on the BY statement in the DATA step. Interleaving combines data sets in sorted order (Interleaving SAS Data Sets); match-merging joins observations identified by the value of a BY variable (Merging SAS Data Sets); and updating uses a data set containing transactions to change values in a master file (Updating SAS Data Sets).
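The three operations can be sketched side by side as follows. All data set names, variables, and values here are hypothetical, and each input is assumed to be already sorted by the BY variable ID.

```sas
/* Hypothetical inputs, each already in ID order */
data work.a;
   input id name $;
   datalines;
1 Ann
3 Cho
;
run;

data work.b;
   input id name $;
   datalines;
2 Bo
4 Dee
;
run;

data work.scores;   /* serves as the master data set for UPDATE */
   input id score;
   datalines;
1 88
3 70
;
run;

data work.trans;    /* transaction data set */
   input id score;
   datalines;
3 91
;
run;

/* Interleaving: SET with BY combines the data sets in ID order */
data work.interleaved;
   set work.a work.b;
   by id;
run;

/* Match-merging: MERGE with BY joins observations that share an ID value */
data work.merged;
   merge work.a work.scores;
   by id;
run;

/* Updating: UPDATE with BY applies the transactions to the master */
data work.updated;
   update work.scores work.trans;
   by id;
run;
```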

NOTSORTED option

The NOTSORTED option can be used in both DATA and PROC steps, except for the SORT procedure. Information about the NOTSORTED option can be found in Writing Lines to the SAS Log or to an Output File. The NOTSORTED option is useful when data are grouped according to the values of a variable, but the groups are not in ascending or descending order. Specifying the NOTSORTED option in the BY statement enables SAS to process such data as BY groups.
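The following sketch illustrates the idea with a hypothetical data set, WORK.MONTHS, whose observations are grouped by month in calendar order rather than alphabetical order. Without NOTSORTED, the BY statement would report that the data are not sorted.

```sas
/* Hypothetical data: grouped by MONTH, but not in alphabetical order */
data work.months;
   input month $ sales;
   datalines;
Jan 10
Jan 15
Feb 20
Feb 5
Mar 30
;
run;

/* NOTSORTED tells SAS to treat each run of equal MONTH values as a BY group */
data work.month_totals;
   set work.months;
   by month notsorted;
   if first.month then total = 0;   /* start of a group */
   total + sales;                   /* accumulate within the group */
   if last.month then output;       /* end of a group */
   keep month total;
run;
```

With NOTSORTED, a value that reappears later in the data set starts a new BY group, so the option suits data that are grouped and only lack a sorted order.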

SORT procedure

The SORT procedure and the role of the BY statement in it are documented in Base SAS Procedures Guide. That guide also describes how to specify different sorting utilities.

  • When you work with large data sets, plan your work so that you sort the data set as few times as possible. For example, if you need to sort a data set by STATE at the beginning of a program and by CITY within STATE later, sort the data set by STATE and CITY at the beginning of the program.

  • To eliminate observations whose BY values duplicate BY values in other observations (but not necessarily values of other variables), use the NODUPKEY option in the SORT procedure.

  • SAS can sort data in sequences other than English-language EBCDIC or ASCII. Examples include the Danish-Norwegian and Finnish/Swedish sequences.
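The points above can be sketched with three PROC SORT steps. The data set WORK.SALES and its contents are hypothetical, and the collating sequences available through the SORTSEQ= option can vary by release and operating environment, so treat the last step as an illustration.

```sas
/* Hypothetical data set used by all three steps */
data work.sales;
   input state $ city $ amount;
   datalines;
NC Raleigh 120
CA Fresno 75
NC Durham 60
CA Oakland 90
;
run;

/* Sort once by STATE and CITY; the result is also in STATE order */
proc sort data=work.sales out=work.sales_sorted;
   by state city;
run;

/* NODUPKEY keeps one observation for each distinct BY value */
proc sort data=work.sales out=work.one_per_state nodupkey;
   by state;
run;

/* SORTSEQ= requests an alternate collating sequence */
proc sort data=work.sales out=work.sales_danish sortseq=danish;
   by city;
run;
```

The OUT= option writes each result to a new data set, which leaves the original WORK.SALES unchanged.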

The SAS documentation for your operating system presents operating system-specific information about the SORT procedure. In general, many points about sorting data depend on the operating system and other local conditions at your site (such as whether various operating system utilities are available).
