![]() | ![]() | ![]() | ![]() |
There are many reasons for sorting your data: to organize data for a report, before combining data sets, or before using a BY statement in another PROC or DATA step. Fortunately, PROC SORT is quite simple. The basic form of this procedure is
PROC SORT; BY variable-1 ... variable-n;
The variables named in the BY statement are called BY variables. You can specify as many BY variables as you wish. With one BY variable, SAS sorts the data based on the values of that variable. With more than one variable, SAS sorts observations by the first variable, then by the second variable within categories of the first, and so on. A BY group is all the observations that have the same values of BY variables. If, for example, your BY variable is State then all the observations for North Dakota form one BY group.
The DATA= and OUT= options specify the input and output data sets. If you don't specify the DATA= option, then SAS will use the most recently created data set. If you don't specify the OUT= option, then SAS will replace the original data set with the newly sorted version. This sample statement tells SAS to sort the data set named MESSY, and then put the sorted data into a data set named NEAT:
PROC SORT DATA = messy OUT = neat;
The NODUPKEY option tells SAS to eliminate any duplicate observations that have the same values for the BY variables. To use this option, just add NODUPKEY to the PROC SORT statement:
PROC SORT DATA = messy OUT = neat NODUPKEY;
By default SAS sorts data in ascending order, from lowest to highest or from A to Z. To have your data sorted from highest to lowest, add the keyword DESCENDING to the BY statement before each variable that should be sorted from highest to lowest. This statement tells SAS to sort first by State (from A to Z) and then by City (from Z to A) within State:
BY State DESCENDING City;
Example: The following data show the average length in feet of selected whales and sharks:
beluga | whale | 15 |
whale | shark | 40 |
basking | shark | 30 |
gray | whale | 50 |
mako | shark | 12 |
sperm | whale | 60 |
dwarf | shark | .5 |
whale | shark | 40 |
humpback | . | 50 |
blue | whale | 100 |
killer | whale | 30 |
This program reads and sorts the data:
DATA marine; INFILE 'c:\MyRawData\Sealife.dat'; INPUT Name $ Family $ Length; * Sort the data; PROC SORT DATA = marine OUT = seasort NODUPKEY; BY Family DESCENDING Length; PROC PRINT DATA = seasort; TITLE 'Whales and Sharks'; RUN; |
The DATA step reads the raw data from a file called Sealife.dat and creates a SAS data set named MARINE. Then PROC SORT rearranges the observations by family in ascending order, and by length in descending order. The NODUPKEY option of PROC SORT eliminates any duplicates, while the OUT= option writes the sorted data into a new data set named SEASORT. The output from the PROC PRINT looks like this:
Whales and Sharks 1 OBS Name Family Length 1 humpback 50.0 2 whale shark 40.0 3 basking shark 30.0 4 mako shark 12.0 5 dwarf shark 0.5 6 blue whale 100.0 7 sperm whale 60.0 8 gray whale 50.0 9 killer whale 30.0 10 beluga whale 15.0 |
Notice that the humpback with a missing value for Family became observation one. That is because missing values are always low for both numeric and character variables. Also, the NODUPKEY option eliminated a duplicate observation for the whale shark. The log contains these notes showing that the sorted data set has one fewer observation than the original data set.
NOTE: The data set WORK.MARINE has 11 observations and 3 variables. NOTE: 1 observations with duplicate key values were deleted. NOTE: the data set WORK.SEASORT has 10 observations and 3 variables. |
About the Authors
This Quick Tip was reproduced with permission from The Little SAS Book: A Primer, Second Edition by Lora D. Delwiche and Susan J. Slaughter, pp. 94-95. Copyright © 1998 by SAS Institute Inc., Cary, NC, USA. The authors also have a third edition available that covers SAS 9.
These sample files and code examples are provided by SAS Institute Inc. "as is" without warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability and fitness for a particular purpose. Recipients acknowledge and agree that SAS Institute shall not be liable for any damages whatsoever arising out of their use of this material. In addition, SAS Institute will provide no support for the materials contained herein.
These sample files and code examples are provided by SAS Institute Inc. "as is" without warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability and fitness for a particular purpose. Recipients acknowledge and agree that SAS Institute shall not be liable for any damages whatsoever arising out of their use of this material. In addition, SAS Institute will provide no support for the materials contained herein.
Type: | Sample |
Topic: | Non SAS Authors ==> Susan Slaughter Non SAS Authors ==> Lora Delwiche SAS Reference ==> Procedures ==> SORT |
Date Modified: | 2005-08-02 03:02:31 |
Date Created: | 2004-10-15 08:42:09 |
Product Family | Product | Host | SAS Release | |
Starting | Ending | |||
SAS System | Base SAS | All | n/a | n/a |