Functions for NLS

Internationalization Compatibility for SAS String Functions

SAS provides string functions and CALL routines for manipulating character data. Many of the original SAS string functions assume that every character occupies exactly one byte. This assumption holds for data in a single-byte character set (SBCS). However, when some of these functions and CALL routines are used with data in a double-byte character set (DBCS) or multi-byte character set (MBCS), they often handle the data improperly and produce incorrect results.

DBCS encodings require a varying number of bytes to represent each character. MBCS is sometimes used as a synonym for DBCS.
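
The one-byte-per-character assumption can be seen by comparing a byte-based length with a character-based length. The following is a minimal sketch, not a definitive test: it assumes a UTF-8 SAS session and a program file saved as UTF-8, and the variable names and sample value are hypothetical. Under those assumptions the byte count is expected to exceed the character count, because the accented letter occupies more than one byte.

```sas
/* Sketch only: assumes a UTF-8 session; names and sample value are hypothetical. */
data _null_;
   text = 'café';
   /* LENGTH counts bytes, so a multi-byte character is counted more than once. */
   byte_len = length(text);
   /* KLENGTH counts characters, regardless of how many bytes each one occupies. */
   char_len = klength(text);
   put byte_len= char_len=;
run;
```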

To solve this problem, SAS introduced a set of string functions and CALL routines, called K functions, for string manipulations where DBCS and MBCS data must be handled carefully. This page shows the level of internationalization (I18N) compatibility for each SAS string function. Compatibility indicates whether a program that uses a particular string function can be adapted to different languages and locales without program changes.

To use the K functions properly, you need to understand the difference between byte-based and character-based offsets and lengths. Most K functions require a character-based offset or length. In an SBCS environment, the byte-based unit is identical to the character-based unit. In a DBCS or MBCS environment, the two units differ significantly, and programmers must distinguish between them. You might need to change your programming logic in order to use the K functions. Most K functions also require strings that are encoded in the current SAS session encoding.
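
As a rough illustration of the byte-based versus character-based distinction, the following sketch asks SUBSTR and KSUBSTR for the "first two units" of the same string. It assumes a UTF-8 SAS session and a program file saved as UTF-8; the variable names and sample value are hypothetical.

```sas
/* Sketch only: assumes a UTF-8 session; names and sample value are hypothetical. */
data _null_;
   text = '日本語データ';
   /* SUBSTR counts in bytes: two bytes is only part of the first character. */
   by_bytes = substr(text, 1, 2);
   /* KSUBSTR counts in characters: this takes the first two whole characters. */
   by_chars = ksubstr(text, 1, 2);
   /* Print the byte-based result in hexadecimal, because it is not valid text. */
   put by_bytes= $hex4.;
   put by_chars=;
run;
```

The same numeric arguments mean different things to the two functions, which is why existing offset arithmetic usually has to be reviewed when a program moves to the K functions.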

String functions are assigned I18N levels depending on whether the functions can process DBCS, MBCS, or SBCS. Here are descriptions of the levels:

I18N Level 0

This function is designed for SBCS data. Do not use this function to process DBCS or MBCS data.

I18N Level 1

Avoid this function, if possible, when you are working with a non-English language. The I18N Level 1 functions might not work correctly with DBCS or MBCS encodings under certain circumstances.

I18N Level 2

This function can be used for SBCS, DBCS, and MBCS (UTF-8) data.
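
In practice, the levels suggest replacing a Level 0 function with a Level 2 counterpart when the data might not be SBCS. The following sketch contrasts INDEX (Level 0) with KINDEX (Level 2). It assumes a UTF-8 SAS session and a program file saved as UTF-8; the variable names and sample value are hypothetical. Under those assumptions the two positions would be expected to differ (byte 10 versus character 4), because each of the first three characters occupies three bytes.

```sas
/* Sketch only: assumes a UTF-8 session; names and sample value are hypothetical. */
data _null_;
   text = '日本語データ';
   /* Level 0: INDEX reports the match position as a byte offset. */
   byte_pos = index(text, 'データ');
   /* Level 2: KINDEX reports the match position as a character offset. */
   char_pos = kindex(text, 'データ');
   /* Pair character offsets only with other character-based K functions. */
   tail = ksubstr(text, char_pos);
   put byte_pos= char_pos= tail=;
run;
```

Passing a byte offset from a Level 0 function to a K function, or the reverse, is the kind of logic error that the offset distinction above is meant to prevent.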

SAS String Functions

| Function | Description | I18N Level 0 | I18N Level 1 | I18N Level 2 |
| --- | --- | :---: | :---: | :---: |
| ANYALNUM | Searches a character string for an alphanumeric character, and returns the first position at which the character is found. | | | X |
| ANYALPHA | Searches a character string for an alphabetic character, and returns the first position at which the character is found. | | | X |
| ANYCNTRL | Searches a character string for a control character, and returns the first position at which that character is found. | | | X |
| ANYDIGIT | Searches a character string for a digit, and returns the first position at which the digit is found. | | | X |
| ANYFIRST | Searches a character string for a character that is valid as the first character in a SAS variable name under VALIDVARNAME=V7, and returns the first position at which that character is found. | | | X |
| ANYGRAPH | Searches a character string for a graphical character, and returns the first position at which that character is found. | | | X |
| ANYLOWER | Searches a character string for a lowercase letter, and returns the first position at which the letter is found. | | | X |
| ANYNAME | Searches a character string for a character that is valid in a SAS variable name under VALIDVARNAME=V7, and returns the first position at which that character is found. | | | X |
| ANYPRINT | Searches a character string for a printable character, and returns the first position at which that character is found. | | | X |
| ANYPUNCT | Searches a character string for a punctuation character, and returns the first position at which that character is found. | | | X |
| ANYSPACE | Searches a character string for a white-space character (blank, horizontal and vertical tab, carriage return, line feed, and form feed), and returns the first position at which that character is found. | | | X |
| ANYUPPER | Searches a character string for an uppercase letter, and returns the first position at which the letter is found. | | | X |
| ANYXDIGIT | Searches a character string for a hexadecimal character that represents a digit, and returns the first position at which that character is found. | | | X |
| BYTE | Returns one character in the ASCII or the EBCDIC collating sequence. | X | | |
| CAT | Does not remove leading or trailing blanks, and returns a concatenated character string. | | | X |
| CATS | Removes leading and trailing blanks, and returns a concatenated character string. | X | | |
| CATT | Removes trailing blanks, and returns a concatenated character string. | X | | |
| CATX | Removes leading and trailing blanks, inserts delimiters, and returns a character string. | X | | |
| CHOOSEC | Returns a character value that represents the results of choosing from a list of arguments. | | | X |
| CHOOSEN | Returns a numeric value that represents the results of choosing from a list of arguments. | | | X |
| COALESCEC | Returns the first non-missing value from a list of character arguments. | | | X |
| COLLATE | Returns a character string in ASCII or EBCDIC collating sequence. | X | | |
| COMPARE | Returns the position of the leftmost character by which two strings differ, or returns 0 if there is no difference. | X | | |
| COMPBL | Removes multiple blanks from a character string. | X | | |
| COMPGED | Returns the generalized edit distance between two strings. | X | | |
| COMPLEV | Returns the Levenshtein edit distance between two strings. | X | | |
| COMPRESS | Returns a character string with specified characters removed from the original string. | X | | |
| COUNT | Counts the number of times that a specified substring appears within a character string. | | X | |
| COUNTC | Counts the number of characters in a string that appear or do not appear in a list of characters. | | X | |
| DEQUOTE | Removes matching quotation marks from a character string that begins with a quotation mark, and deletes all characters to the right of the closing quotation mark. | | | X |
| FIND | Searches for a specific substring of characters within a character string. | | X | |
| FINDC | Searches a string for any character in a list of characters. | | X | |
| HTMLDECODE | Decodes a string that contains HTML numeric character references or HTML character entity references, and returns the decoded string. | | | X |
| HTMLENCODE | Encodes characters using HTML character entity references, and returns the encoded string. | | | X |
| IFC | Returns a character value based on whether an expression is true, false, or missing. | | | X |
| IFN | Returns a numeric value based on whether an expression is true, false, or missing. | | | X |
| INDEX | Searches a character expression for a string of characters, and returns the position of the string's first character for the first occurrence of the string. | X | | |
| INDEXC | Searches a character expression for any of the specified characters, and returns the position of that character. | X | | |
| INDEXW | Searches a character expression for a string that is specified as a word, and returns the position of the first character in the word. | X | | |
| KCOMPARE | Returns the result of a comparison of character expressions. | | | X |
| KCOMPRESS | Removes specified characters from a character expression. | | | X |
| KCOUNT | Returns the number of double-byte characters in an expression. | | | X |
| KCVT | Converts data from one encoding to another encoding. | | | X |
| KINDEX | Searches a character expression for a string of characters. | | | X |
| KINDEXC | Searches a character expression for specified characters. | | | X |
| KLEFT | Left-aligns a character expression by removing unnecessary leading DBCS blanks and SO-SI. | | | X |
| KLENGTH | Returns the length of an argument. | | | X |
| KLOWCASE | Converts all letters in an argument to lowercase. | | | X |
| KREVERSE | Reverses a character expression. | | | X |
| KRIGHT | Right-aligns a character expression by trimming trailing DBCS blanks and SO-SI. | | | X |
| KSCAN | Selects a specified word from a character expression. | | | X |
| KSTRCAT | Concatenates two or more character expressions. | | | X |
| KSUBSTR | Extracts a substring from an argument. | | | X |
| KSUBSTRB | Extracts a substring from an argument according to the byte position of the substring in the argument. | | | X |
| KTRANSLATE | Replaces specific characters in a character expression. | | | X |
| KTRIM | Removes trailing DBCS blanks and SO-SI from character expressions. | | | X |
| KTRUNCATE | Truncates a character string to a specified length. | | | X |
| KUPCASE | Converts all letters in an argument to uppercase. | | | X |
| KUPDATE | Inserts, deletes, and replaces character value contents. | | | X |
| KUPDATEB | Inserts, deletes, and replaces the contents of the character value according to the byte position of the character value in the argument. | | | X |
| KVERIFY | Returns the position of the first character that is unique to an expression. | | | X |
| LEFT | Left-aligns a character string. | X | | |
| LENGTH | Returns the length of a non-blank character string, excluding trailing blanks, and returns 1 for a blank character string. | X | | |
| LENGTHC | Returns the length of a character string, including trailing blanks. | | | X |
| LENGTHM | Returns the amount of memory (in bytes) that is allocated for a character string. | | | X |
| LENGTHN | Returns the length of a character string, excluding trailing blanks. | X | | |
| LOWCASE | Converts all letters in an argument to lowercase. | | | X |
| MISSING | Returns a numeric result that indicates whether the argument contains a missing value. | | | X |
| NLITERAL | Converts a character string that you specify to a SAS name literal. | | | X |
| NOTALNUM | Searches a character string for a non-alphanumeric character, and returns the first position at which the character is found. | | | X |
| NOTALPHA | Searches a character string for a nonalphabetic character, and returns the first position at which the character is found. | | | X |
| NOTCNTRL | Searches a character string for a character that is not a control character, and returns the first position at which that character is found. | | | X |
| NOTDIGIT | Searches a character string for any character that is not a digit, and returns the first position at which that character is found. | | | X |
| NOTFIRST | Searches a character string for an invalid first character in a SAS variable name under VALIDVARNAME=V7, and returns the first position at which that character is found. | | | X |
| NOTGRAPH | Searches a character string for a non-graphical character, and returns the first position at which that character is found. | | | X |
| NOTLOWER | Searches a character string for a character that is not a lowercase letter, and returns the first position at which that character is found. | | | X |
| NOTNAME | Searches a character string for an invalid character in a SAS variable name under VALIDVARNAME=V7, and returns the first position at which that character is found. | | | X |
| NOTPRINT | Searches a character string for a nonprintable character, and returns the first position at which that character is found. | X | | |
| NOTPUNCT | Searches a character string for a character that is not a punctuation character, and returns the first position at which that character is found. | X | | |
| NOTSPACE | Searches a character string for a character that is not a white-space character (blank, horizontal and vertical tab, carriage return, line feed, and form feed), and returns the first position at which that character is found. | X | | |
| NOTUPPER | Searches a character string for a character that is not an uppercase letter, and returns the first position at which that character is found. | | | X |
| NOTXDIGIT | Searches a character string for a character that is not a hexadecimal character, and returns the first position at which that character is found. | | | X |
| NVALID | Checks the validity of a character string for use as a SAS variable name. | X | | |
| PROPCASE | Converts all words in an argument to proper case. | | | X |
| QUOTE | Adds double quotation marks to a character value. | | | X |
| RANK | Returns the position of a character in the ASCII or EBCDIC collating sequence. | X | | |
| REPEAT | Returns a character value that consists of the first argument repeated n+1 times. | | X | |
| REVERSE | Reverses a character string. | X | | |
| RIGHT | Right-aligns a character expression. | X | | |
| SCAN | Returns the nth word from a character string. | X | | |
| SOUNDEX | Encodes a string to facilitate searching. | X | | |
| SPEDIS | Determines the likelihood of two words matching, expressed as the asymmetric spelling distance between the two words. | X | | |
| STRIP | Returns a character string with all leading and trailing blanks removed. | X | | |
| SUBPAD | Returns a substring that has a length you specify, using blank padding if necessary. | | X | |
| SUBSTR | Extracts a substring from an argument. | X | | |
| SUBSTRN | Returns a substring, allowing a result with a length of zero. | | X | |
| TRANSLATE | Replaces specific characters in a character string. | X | | |
| TRANTAB | Transcodes data by using the specified translation table. | X | | |
| TRANWRD | Replaces or removes all occurrences of a substring in a character string. | | | X |
| TRIM | Removes trailing blanks from a character string, and returns one blank if the string is missing. | X | | |
| TRIMN | Removes trailing blanks from character expressions, and returns a string with a length of zero if the expression is missing. | X | | |
| UPCASE | Converts all letters in an argument to uppercase. | | | X |
| URLDECODE | Returns a string that was decoded using the URL escape syntax. | | | X |
| URLENCODE | Returns a string that was encoded using the URL escape syntax. | | | X |
| VERIFY | Returns the position of the first character in a string that is not in any of several other strings. | X | | |
