UNICODEC Function
Converts characters in the current SAS session encoding to Unicode characters.
Syntax
STR=UNICODEC(<instr> (,<Unicode type>))
Arguments
- str
  data string that has been converted to Unicode encoding.
- instr
  input data string.
- Unicode type
  Unicode character format of the output. This argument is optional; if it is omitted, ESC is used. Valid values:
ESC    | Unicode Escape (for example, \u0042). ESC is the default format.
NCR    | Numeric Character Representation (for example, &#22823; for 大 or &#177; for ±)
PAREN  | Unicode Parenthesis Escape (for example, <u0061>)
UCS2   | UCS2 encoding with native endian.
UCS2B  | UCS2 encoding with big endian.
UCS2L  | UCS2 encoding with little endian.
UCS4   | UCS4 encoding with native endian.
UCS4B  | UCS4 encoding with big endian.
UCS4L  | UCS4 encoding with little endian.
UTF16  | UTF16 encoding with big endian.
UTF16B | UTF16 encoding with big endian.
UTF16L | UTF16 encoding with little endian.
UTF8   | UTF8 encoding.
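To illustrate how one of the formats above is used, the following sketch converts a single character to its escape form and back again with the UNICODE function, which performs the reverse conversion. The variable names are illustrative only, and the sketch assumes a session encoding that can represent the sample character (for example, UTF-8).

```sas
data _null_;
   length esc back $ 40;
   dai  = unicode('\u5927');      /* one character in the session encoding */
   esc  = unicodec(dai, 'esc');   /* session encoding -> Unicode escape */
   back = unicode(esc, 'esc');    /* Unicode escape -> session encoding */
   if back = dai then put 'Round trip preserved the character.';
   else put 'Round trip changed the value: ' esc= back=;
run;
```

The same pattern should apply to the other text formats (NCR and PAREN) by passing the same type name to both functions.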
Details
This function reads characters that are in the current SAS session encoding and converts them to Unicode encoding.
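Because the input is interpreted in the session encoding, it can be useful to confirm that encoding before converting. A minimal sketch, assuming the automatic macro variable SYSENCODING is available in your release and that the session encoding can represent the sample character; the variable names are illustrative:

```sas
/* Write the current session encoding to the log */
%put Session encoding: &sysencoding;

data _null_;
   length ncr $ 20;
   dai = unicode('\u5927');     /* sample character, stored in the session encoding */
   ncr = unicodec(dai, 'ncr');  /* same character as a numeric character reference */
   put ncr=;
run;
```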
Example
The following example demonstrates the functionality of the UNICODEC function:
length str4 $20;
dai=unicode('\u5927');
str1=unicodec("ABC");
str2=unicodec("ABC",'esc');
str3=unicodec(dai, 'ncr');
str4=unicodec("ab",'paren');
str5=unicodec(dai, 'ucs2');
str6=unicodec(dai, 'ucs2b');
str7=unicodec(dai, 'ucs2l');
str8=unicodec(dai, 'ucs4');
str9=unicodec(dai, 'ucs4b');
str10=unicodec(dai, 'ucs4l');
str11=unicodec(dai, 'utf8');
str12=unicodec(dai, 'utf16');
str13=unicodec(dai, 'utf16b');
str14=unicodec(dai, 'utf16l');
Results:
str1=414243
str2=414243
str3=&#22823;
str4=<u0061><u0062>
str5=2759
str6=5927
str7=2759
str8=27590000
str9=00005927
str10=27590000
str11=E5A4A7
str12=2759
str13=5927
str14=2759
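The byte-oriented results above appear as hexadecimal digits, although the example itself contains no output statements. One possible way to display values in that form is the $HEXw. format, sketched below for two of the variables. This is an assumption about how the listing was produced, not something the example states.

```sas
data _null_;
   dai   = unicode('\u5927');
   str11 = unicodec(dai, 'utf8');
   str13 = unicodec(dai, 'utf16b');
   /* $HEXw. writes two hex digits per byte: 6 digits = 3 bytes, 4 digits = 2 bytes */
   put str11= $hex6.;
   put str13= $hex4.;
run;
```

The widths are chosen to match the byte counts shown in the results for UTF8 (three bytes) and UTF16B (two bytes) for this character.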