
Procedures under OpenVMS

SORT Procedure: OpenVMS



Sorts observations in a SAS data set by one or more variables, then stores the resulting sorted observations in a new SAS data set or replaces the original data set.
OpenVMS specifics: available sort routines
See: SORT Procedure in Base SAS Procedures Guide

Syntax
Details
NODUPKEY Option
SORTWKNO= Option
Customizing Collating Sequences
Setting the Host Sort Utility as the Sort Algorithm
Specifying the SORTSEQ= Option with a Host Sort Utility
Example: Creating a SAS View with a Dummy BY Variable
See Also

Syntax

PROC SORT <option(s)> <collating-sequence-option>;

Note:   This is a simplified version of the SORT procedure syntax. For the complete syntax and its explanation, see the SORT procedure in Base SAS Procedures Guide.

option(s)

NODUPKEY

under OpenVMS, the observation that is returned is unpredictable; that is, the observation returned is not guaranteed to be the first observation that was encountered for that BY variable. For further explanation of the NODUPKEY option, see NODUPKEY Option.

SORTWKNO=n

specifies the number of sort work files to be used by the OpenVMS sort utility. The value for n can be 0 through 10. For further explanation of the SORTWKNO= option, see SORTWKNO= Option.


Details

The SORT procedure sorts observations in a SAS data set by one or more character or numeric variables, either replacing the original data set or creating a new, sorted data set. By default under OpenVMS, the SORT procedure uses the ASCII collating sequence.

The SORT procedure uses the sort utility specified by the SORTPGM system option. By default, when the SORTPGM system option is set to HOST, the SORT procedure uses the OpenVMS sort utility. (An alternate host sort utility, Hypersort V04-003, is also available. For information about how the sort utility is chosen, see SORTPGM= System Option: OpenVMS.)

You can use all of the options available to the SAS sort utility with your host sort. If you specify an option that is not supported by the host sort, then the SAS sort will be used instead. For a complete list of options, see the SORT procedure in the Base SAS Procedures Guide.


NODUPKEY Option

The SAS sort utility and the OpenVMS sort utility differ slightly in their implementation of the NODUPKEY option. If you need to use both the NODUPKEY and EQUALS options (that is, if you need to guarantee that the first observation returned is the first observation that was input), then use the SAS sort utility.

When you use the SAS sort utility, the NODUPKEY option implies the EQUALS option by default. As a result, the observation that is returned for like BY values is the first observation that was encountered for the key BY variable. That is, the observations are returned in the order in which they were input.

By contrast, the OpenVMS sort utility does not support the EQUALS option with the NODUPKEY option. When NODUPKEY is used with the OpenVMS sort utility, the EQUALS option is set to NOEQUALS unconditionally. As a result, when NODUPKEY is specified with the OpenVMS sort utility, the observation that is returned for observations with like BY values is not guaranteed to be the first observation that was encountered for that BY variable. The observation that the OpenVMS sort utility returns when NODUPKEY is in effect is unpredictable.
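The following sketch illustrates the recommendation above: it selects the SAS sort utility before sorting with both NODUPKEY and EQUALS. The data set names VISITS and FIRST_VISIT and their variables are hypothetical and are used only for illustration.

```sas
/* Hypothetical input: two observations share the BY value id=1. */
data visits;
   input id visit $;
   datalines;
1 first
2 only
1 second
;

/* Select the SAS sort utility so that EQUALS is honored with NODUPKEY. */
options sortpgm=sas;

proc sort data=visits out=first_visit nodupkey equals;
   by id;
run;

proc print data=first_visit;
run;
```

With the SAS sort utility in effect, the observation kept for id=1 should be the one that was read first. If SORTPGM=HOST were in effect instead, the observation kept for id=1 would not be predictable, as described above.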


SORTWKNO= Option

The SORTWKNO= option specifies the number of sort work files to be used by the OpenVMS sort utility. Valid values range from 0 through 10.

The OpenVMS sort utility can support up to 10 work files. If you set SORTWKNO= to 0 and define the ten sort work files, SAS uses the ten files. To use the sort work files, you must define a SORTWORK# logical name for each sort work area. For example:

$DEFINE SORTWORK0  DISK1:[TEMP]
$DEFINE SORTWORK1  DISK2:[TEMP]
$DEFINE SORTWORK2  DISK3:[TEMP]

The SORTWORK= system option can also be used to assign up to ten work files.

The following example uses the SORTWKNO= option to specify that four work files should be used:

libname mylib '[mydata]';

proc sort data=mylib.june sortwkno=4;
   by revenue;
run;


Customizing Collating Sequences

The options EBCDIC, ASCII, NATIONAL, DANISH, SWEDISH, and REVERSE specify collating sequences that are stored in the HOST catalog.

If you want to provide your own collating sequences or change a collating sequence provided for you, use the TRANTAB procedure to create or modify translation tables. For more information about the TRANTAB procedure, see SAS National Language Support (NLS): Reference Guide. When you create your own translation tables, they are stored in your PROFILE catalog, and they override any translation tables that have the same names in the HOST catalog.

Note:   System managers can modify the HOST catalog by copying newly created tables from the PROFILE catalog to the HOST catalog. Then all users can access the new or modified translation table.

If you are using the SAS windowing environment and want to see the names of the collating sequences that are stored in the HOST catalog, issue the following command from any window:

CATALOG SASHELP.HOST

If you are not using the SAS windowing environment, then issue the following statements to generate a list of the contents of the HOST catalog:

proc catalog catalog=sashelp.host;
contents;
run;

Entries of type TRANTAB are the collating sequences.

To see the contents of a particular translation table, use the following statements:

proc trantab table=table-name;
list;
run;

The contents of collating sequences are displayed in the SAS log.


Setting the Host Sort Utility as the Sort Algorithm

To specify a host sort utility as the sort algorithm:

  1. Set the SORTPGM system option to tell SAS when to use the host sort utility.

    • If SORTPGM=HOST, then SAS will use the OpenVMS sort utility. If you have enabled the Hypersort utility, then SAS will use it as the host sort utility.

    • If SORTPGM=BEST, then SAS chooses the best sorting method (either the SAS sort or the host sort) for the situation.

    For more information, see SORTPGM= System Option: OpenVMS.
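As a minimal sketch of the step above, the following statements set the SORTPGM system option and then write its current value to the SAS log so that you can confirm the setting. The choice of HOST here is only an example; BEST could be substituted.

```sas
/* Request the host sort utility for subsequent sorts. */
options sortpgm=host;

/* Write the current SORTPGM setting to the SAS log. */
proc options option=sortpgm;
run;
```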

Specifying the SORTSEQ= Option with a Host Sort Utility

The SORTSEQ= option enables you to specify the collating sequence for your sort. For a list of valid values, see Base SAS Procedures Guide.
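A minimal sketch of the option follows, assuming a small data set with a character BY variable. The names ONE, ONE_EBCDIC, and the sample values are illustrative only. The caution that follows in this section still applies whenever a host sort utility is in effect.

```sas
/* Hypothetical input with mixed-case character values. */
data one;
   input name $ age;
   datalines;
anne 35
ALBERT 10
;

/* Sort by NAME using the EBCDIC collating sequence instead of the default. */
proc sort data=one out=one_ebcdic sortseq=ebcdic;
   by name;
run;

proc print data=one_ebcdic;
run;
```

In the EBCDIC collating sequence, lowercase letters collate before uppercase letters, so anne would be expected ahead of ALBERT in the output data set.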

CAUTION:
If you are using a host sort utility to sort your data, then specifying the SORTSEQ= option might corrupt the character BY variables if the sort sequence translation table and its inverse are not one-to-one mappings.

In other words, for the sort to work, the translation table must map each character to a unique weight, and the inverse table must map each weight to a unique character.

If your translation tables do not map one-to-one, then you can use an alternative method to perform your sort, such as creating a SAS view with a dummy BY variable. For an example, see Example: Creating a SAS View with a Dummy BY Variable.

Note:   After using such a method, you might need to perform subsequent BY processing using either the NOTSORTED option or the NOBYSORTED system option. For more information about the NOTSORTED option, see BY Statement in SAS Language Reference: Dictionary. For more information about the NOBYSORTED system option, see BYSORTED System Option in SAS Language Reference: Dictionary.


Example: Creating a SAS View with a Dummy BY Variable

The following code is an example of creating a SAS view using a dummy BY variable:

options nodate nostimer ls=78 ps=60;
options sortpgm=host msglevel=i;

data one;
   input name $ age;
   datalines;
   anne 35
   ALBERT 10
   JUAN 90
   janet 5
   bridget 23
   BRIAN 45
   ;

data oneview / view=oneview;
   set one;
   name1=upcase(name);
run;

proc sort data=oneview out=final(drop=name1);
   by name1;
run;

proc print data=final;
run;

The output is the following:

Creating a SAS View with a Dummy BY Variable

  The SAS System
Obs        name       age
 1         ALBERT      10
 2         anne        35
 3         BRIAN       45
 4         bridget     23
 5         janet        5
 6         JUAN        90
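Because the output is ordered by the uppercased value rather than by name itself, later BY processing on name might need the NOTSORTED option, as noted in the SORTSEQ= section. The following sketch assumes a stand-in data set, MIXED, that imitates that ordering; MIXED, GROUPS, and GROUP_START are hypothetical names.

```sas
/* Stand-in data: ordered by UPCASE(name), not by name itself. */
data mixed;
   input name $ age;
   datalines;
ALBERT 10
anne 35
BRIAN 45
bridget 23
;

/* NOTSORTED tells SAS not to require ascending order of NAME;   */
/* BY groups are formed from consecutive observations instead.   */
data groups;
   set mixed;
   by name notsorted;
   group_start = first.name;
run;
```

Each name here is distinct, so every observation starts its own BY group and GROUP_START is 1 throughout. Without NOTSORTED, the BY statement could report that the data set is not sorted in ascending sequence.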

See Also
