SORTKEY Function

Creates a linguistic sort key.
Category: Locale

Syntax

sortKey(string, <locale, strength, case, numeric, order> )

Required Argument

string
specifies a character expression.

Optional Arguments

locale
specifies the locale name in POSIX form (for example, ja_JP). See Values for the LOCALE= System Option for a list of locale names and POSIX values.
strength
specifies the collation level. There are five collation levels, listed here by value, type of collation, and description. The default value for strength depends on the locale.
PRIMARY or P
Type of collation: PRIMARY distinguishes base characters (for example, "a" < "b").
Description: This is the strongest difference. For example, dictionaries are divided into sections by base character.
SECONDARY or S
Type of collation: Accents in the characters are considered secondary differences (for example, "as" < "às" < "at").
Description: Other differences between letters can also be considered secondary differences, depending on the language. A secondary difference is ignored when there is a primary difference anywhere in the strings.
TERTIARY or T
Type of collation: Uppercase and lowercase differences in characters are distinguished at the tertiary level (for example, "ao" < "Ao" < "aò").
Description: Another example is the difference between large and small Kana. A tertiary difference is ignored when there is a primary or secondary difference anywhere in the strings.
QUATERNARY or Q
Type of collation: When punctuation is ignored at levels 1-3, an additional level can be used to distinguish words with and without punctuation (for example, "ab" < "a-b" < "aB").
Description: This difference is ignored when there is a primary, secondary, or tertiary difference. Use the quaternary level if ignoring punctuation is required or when processing Japanese text.
IDENTICAL or I
Type of collation: When all other levels are equal, the identical level is used as a tiebreaker. The Unicode code point values of the NFD form of each string are compared at this level, in case there is no difference at levels 1-4.
Description: For example, only Hebrew cantillation marks are distinguished at this level. Use this level sparingly, because it is extremely rare for two strings to differ only in their code point values.
case order
specifies whether uppercase or lowercase letters sort first. This argument is valid only when strength is TERTIARY, QUATERNARY, or IDENTICAL. The following values can be specified.
UPPER or U
sorts uppercase letters first, then lowercase letters.
LOWER or L
sorts lowercase letters first, then uppercase letters.
numeric collation
orders numbers by their numeric value instead of by their characters. The following value can be specified.
NUMERIC or N
orders numbers (integers) by numeric value. For example, "8 Main St." sorts before "45 Main St.".
collation order
specifies the type of collation. There are two values: PHONEBOOK and TRADITIONAL. If you do not specify a collation value, then the default collation for the user's locale is used.
PHONEBOOK or P
specifies a phonebook-style ordering of characters. Use PHONEBOOK only with the German language.
TRADITIONAL or T
specifies a traditional-style ordering of characters. Use TRADITIONAL only with the Spanish language.
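
The effect of the NUMERIC value described above can be illustrated outside SAS. The following Python sketch is only an approximation of the idea, not the SORTKEY implementation; the helper name numeric_key is hypothetical. It splits each string into digit runs and text, and compares the digit runs as integers.

```python
import re

def numeric_key(text):
    """Hypothetical helper: build a key in which digit runs compare as integers."""
    parts = re.split(r"(\d+)", text)
    # re.split with one capture group alternates text (even index)
    # and captured digit runs (odd index).
    return [int(part) if index % 2 else part.casefold()
            for index, part in enumerate(parts)]

addresses = ["45 Main St.", "8 Main St."]

# Plain character ordering compares "4" with "8" first.
print(sorted(addresses))                   # ['45 Main St.', '8 Main St.']

# Numeric ordering compares 8 with 45.
print(sorted(addresses, key=numeric_key))  # ['8 Main St.', '45 Main St.']
```
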

Details

The SORTKEY function creates a linguistic sort key for data. You must specify at least one argument, the string. If the variable that receives the key is not long enough to hold it, the key is truncated and a warning is displayed.
locale
Locale values use the POSIX name (ll_RR), where ll represents the two-letter language code and RR represents the two-letter region code. For example, en_US is the POSIX name for English, United States: en represents the English language, and US represents the United States. If a locale value is not specified, then the session locale is used.
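
As an illustration of the ll_RR form only, the following Python sketch splits a POSIX locale name into its language and region codes. The helper name split_posix_locale is hypothetical and is not part of SAS; the pattern assumes a lowercase two-letter language code and an uppercase two-letter region code, as described above.

```python
import re

def split_posix_locale(name):
    """Hypothetical helper: return (language, region) for a name such as 'en_US'."""
    match = re.fullmatch(r"([a-z]{2})_([A-Z]{2})", name)
    if match is None:
        raise ValueError(f"not an ll_RR locale name: {name!r}")
    return match.group(1), match.group(2)

print(split_posix_locale("en_US"))  # ('en', 'US')
print(split_posix_locale("ja_JP"))  # ('ja', 'JP')
```
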
strength
The strength argument determines whether accents or case affect collating or matching text. If no value is specified for strength, then the locale determines the value. The following values can be specified for strength.
PRIMARY
This value compares base letters only. For example, the letters A, a, and Å are all processed the same.
SECONDARY
This value processes data the same as PRIMARY, and also processes accents. The letters A and a are processed equally, and Å is processed as an accented character.
TERTIARY
This value processes data the same as SECONDARY, and also processes the character's case. For example, A, a, and Å are all processed differently.
QUATERNARY
This value processes data the same as TERTIARY, and also processes punctuation.
IDENTICAL
This value processes data the same as QUATERNARY, and also processes code point values.
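
The first three strength values can be approximated with a toy key built from Python's standard unicodedata module. This is a rough sketch under simplifying assumptions, not the algorithm that SORTKEY uses: the helper names are hypothetical, and the QUATERNARY and IDENTICAL levels are omitted. It reproduces the A, a, Å behavior described above.

```python
import unicodedata

def strip_accents(text):
    """Remove combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def toy_key(text, strength):
    """Hypothetical helper: a layered key that mimics three strength levels."""
    base = strip_accents(text).casefold()                    # base letters only
    accents = unicodedata.normalize("NFD", text.casefold())  # adds accents
    cased = unicodedata.normalize("NFD", text)               # adds case
    if strength == "PRIMARY":
        return (base,)
    if strength == "SECONDARY":
        return (base, accents)
    if strength == "TERTIARY":
        return (base, accents, cased)
    raise ValueError(f"unsupported strength in this sketch: {strength!r}")

letters = ["A", "a", "\u00c5"]  # \u00c5 is the letter A with ring above

for strength in ("PRIMARY", "SECONDARY", "TERTIARY"):
    distinct = len({toy_key(letter, strength) for letter in letters})
    print(strength, distinct)
# PRIMARY 1
# SECONDARY 2
# TERTIARY 3
```
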
case order
specifies whether uppercase or lowercase letters sort first. The following table shows how the same words are ordered with the UPPER value and with the LOWER value.
UPPER     LOWER
Aztec     aztec
aztec     Aztec
Mars      mars
mars      Mars
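
The two orderings in the table can be reproduced with a small Python sketch. It is an illustration only, assuming that case acts as a tiebreaker after the letters themselves have been compared; the helper name case_order_key is hypothetical and does not reflect how SORTKEY builds its keys.

```python
def case_order_key(word, case_order):
    """Hypothetical helper: compare letters first, then break ties by case."""
    if case_order == "UPPER":
        flags = tuple(0 if ch.isupper() else 1 for ch in word)
    elif case_order == "LOWER":
        flags = tuple(0 if ch.islower() else 1 for ch in word)
    else:
        raise ValueError(f"unknown case order: {case_order!r}")
    return (word.casefold(), flags)

words = ["mars", "Aztec", "Mars", "aztec"]

print(sorted(words, key=lambda w: case_order_key(w, "UPPER")))
# ['Aztec', 'aztec', 'Mars', 'mars']
print(sorted(words, key=lambda w: case_order_key(w, "LOWER")))
# ['aztec', 'Aztec', 'mars', 'Mars']
```
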
collation order
The collation order value PHONEBOOK is ignored unless the locale is a German-language locale.
The collation order value TRADITIONAL is ignored unless the locale is a Spanish-language locale.
A warning message is displayed for other locales.
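
To show why a phonebook-style ordering can differ from the default, the following Python sketch assumes the common German phonebook convention in which ä, ö, and ü are compared as ae, oe, and ue. This convention is an assumption of the sketch and is not stated in this documentation; the helper names and the sample names are hypothetical.

```python
import unicodedata

# Assumed phonebook convention: umlauts expand to vowel + e.
_PHONEBOOK = str.maketrans({"ä": "ae", "ö": "oe", "ü": "ue",
                            "Ä": "Ae", "Ö": "Oe", "Ü": "Ue"})

def dictionary_key(name):
    """Hypothetical helper: compare umlauts as their base vowels."""
    decomposed = unicodedata.normalize("NFD", name)
    return "".join(ch for ch in decomposed
                   if not unicodedata.combining(ch)).casefold()

def phonebook_key(name):
    """Hypothetical helper: compare umlauts as vowel + e."""
    composed = unicodedata.normalize("NFC", name)
    return composed.translate(_PHONEBOOK).casefold()

names = ["Müller", "Mugler"]

print(sorted(names, key=dictionary_key))  # ['Mugler', 'Müller']
print(sorted(names, key=phonebook_key))   # ['Müller', 'Mugler']
```
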