
DataFlux Data Management Studio 2.5: User Guide

Encodings

In most cases, you will select Default from the Encoding drop-down menu. The following table explains the options available in that menu:

| Option | Character Set | Encoding Constant | Description |
| --- | --- | --- | --- |
| hp-roman8 | Latin | 19 | An 8-bit Latin character set. |
| IBM437 | Latin | 32 | Original character set of the IBM PC. Also known as CP437. |
| IBM850 | Western Europe | 33 | A code page used in Western Europe. Also referred to as MS-DOS Code Page 850. |
| IBM1047 | EBCDIC Latin 1 | 10 | An EBCDIC code page covering the Latin 1 characters. |
| ISO-8859-1 | Latin 1 | 1 | A standard Latin alphabet character set. |
| ISO-8859-2 | Latin 2 | 2 | An 8-bit character set for Central and Eastern European languages written in the Latin script, such as Polish, Czech, Slovak, and Hungarian. Commonly referred to as Latin 2. |
| ISO-8859-3 | Latin 3 | 13 | An 8-bit character encoding. Formerly used to cover Turkish, Maltese, and Esperanto. Also known as "South European". |
| ISO-8859-4 | Latin 4 | 14 | An 8-bit character encoding originally used for Estonian, Latvian, Lithuanian, Greenlandic, and Sami. Also known as "North European". |
| ISO-8859-5 | Latin/Cyrillic | 3 | An 8-bit Cyrillic character set that can be used for Bulgarian, Belarusian, and Russian. |
| ISO-8859-6 | Latin/Arabic | 9 | An 8-bit character set covering a limited subset of Arabic. |
| ISO-8859-7 | Latin/Greek | 4 | An 8-bit character encoding covering the modern Greek language along with mathematical symbols derived from Greek. |
| ISO-8859-8 | Latin/Hebrew | 11 | Contains all of the Hebrew letters but not the Hebrew vowel signs. Its MIME charset name is ISO-8859-8. |
| ISO-8859-9 | Turkish | 5 | An 8-bit character set covering Turkish. It matches ISO-8859-1 except that the Icelandic letters are replaced by Turkish ones. Also known as Latin-5. |
| ISO-8859-10 | Nordic | 15 | An 8-bit character set designed for Nordic languages. Also known as Latin-6. |
| ISO-8859-11 | Latin/Thai | 6 | An 8-bit character set covering Thai. Closely related to TIS-620. |
| ISO-8859-13 | Baltic | 16 | An 8-bit character set covering Baltic languages. Also known as Latin-7 or "Baltic Rim". |
| ISO-8859-14 | Celtic | 17 | An 8-bit character set covering Celtic languages such as Gaelic, Welsh, and Breton. Also known as Latin-8 or Celtic. |
| ISO-8859-15 | Latin 9 | 18 | An 8-bit character set for English, French, German, Spanish, and Portuguese, as well as other Western European languages. |
| KOI8-R | Russian | 12 | An 8-bit character set covering Russian. |
| Shift-JIS | Japanese |  | A Japanese encoding that mixes single-byte and double-byte characters. Its double-byte characters come from the JIS X 0208 character set. |
| TIS-620 | Thai | 20 | A character set used for the Thai language. |
| UCS-2BE | Big Endian | 7 | A 2-byte Unicode encoding in which the highest-order byte is stored first, at the lowest address. Similar to UTF-16. |
| UCS-2LE | Little Endian | 8 | A 2-byte Unicode encoding in which the lowest-order byte is stored first, at the lowest address. Similar to UTF-16. |
| US-ASCII | ASCII | 31 | ASCII (American Standard Code for Information Interchange) is a character set based on the English alphabet. |
| UTF-8 | Unicode |  | A variable-length encoding of Unicode that uses 8-bit code units. |
| Windows-874 | Windows Thai | 21 | Microsoft Windows Thai code page. |
| Windows-1250 | Windows Latin 2 | 22 | Windows code page representing Central European languages such as Polish, Czech, Slovak, Hungarian, Slovene, Croatian, Romanian, and Albanian. This option can also be used for German. |
| Windows-1251 |  | 23 |  |
| Windows-1252 | Windows Latin 1 | 24 | Windows code page for Western European languages. Nearly identical to ISO-8859-1, differing mainly in the byte range 0x80 to 0x9F, where it defines printable characters. |
| Windows-1253 | Windows Greek | 25 | A Windows code page used for modern Greek. |
| Windows-1254 | Windows Turkish | 26 | Represents the Turkish Windows code page. |
| Windows-1255 | Windows Hebrew | 27 | This code page is used to write Hebrew. |
| Windows-1256 | Windows Arabic | 28 | This Windows code page is used to write Arabic in Microsoft Windows. |
| Windows-1257 | Windows Baltic | 29 | Used to write Estonian, Latvian, and Lithuanian in Microsoft Windows. |
| Windows-1258 | Windows Vietnamese | 30 | This code page is used to write Vietnamese text. |
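
To make the differences between these options more concrete, the following minimal sketch shows how the same text maps to different bytes, and the same byte to different characters, depending on the encoding. It uses Python's standard codecs as stand-ins for the table entries and does not call any DataFlux API. The helper name `hex_bytes` is hypothetical. The Python codecs `utf-16-be` and `utf-16-le` are used in place of UCS-2BE and UCS-2LE, on the assumption that only characters in the Basic Multilingual Plane are involved, where the two forms produce the same bytes.

```python
# Illustrative sketch only: Python's standard codecs stand in for the
# encodings in the table. None of this is DataFlux API.

def hex_bytes(text, codec):
    """Encode text with the named Python codec; return space-separated hex."""
    return " ".join(f"{b:02x}" for b in text.encode(codec))

# Byte order. "A" is U+0041, inside the Basic Multilingual Plane, where
# UTF-16 and UCS-2 yield the same two bytes.
ucs2_be = hex_bytes("A", "utf-16-be")  # "00 41": highest-order byte first
ucs2_le = hex_bytes("A", "utf-16-le")  # "41 00": lowest-order byte first

# One character, three different byte sequences.
euro = "\u20ac"  # euro sign
euro_cp1252 = hex_bytes(euro, "cp1252")       # Windows-1252: "80"
euro_latin9 = hex_bytes(euro, "iso-8859-15")  # "a4"
euro_utf8 = hex_bytes(euro, "utf-8")          # "e2 82 ac"

# One byte, two different characters.
raw = b"\xfd"
as_latin1 = raw.decode("iso-8859-1")   # U+00FD, y with acute accent
as_turkish = raw.decode("iso-8859-9")  # U+0131, dotless i

# A character that the target encoding cannot represent raises an error.
try:
    euro.encode("iso-8859-1")
    latin1_has_euro = True
except UnicodeEncodeError:
    latin1_has_euro = False

print("UCS-2BE 'A':", ucs2_be)
print("UCS-2LE 'A':", ucs2_le)
print("Euro sign:", euro_cp1252, "|", euro_latin9, "|", euro_utf8)
print("Byte 0xFD:", as_latin1, "vs", as_turkish)
print("ISO-8859-1 can encode the euro sign:", latin1_has_euro)
```

Reading a file with the wrong option from the table typically produces exactly this kind of silent substitution (a different character for the same byte) rather than an error, which is one reason to confirm the source encoding before overriding Default.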

Documentation Feedback: yourturn@sas.com
Note: Always include the Doc ID when providing documentation feedback.

Doc ID: dfU_Encodings.html