Encoding

The table below describes the options available in the Encoding drop-down list. In most cases, you will select Default.
| Option | Character Set | Encoding Constant | Description |
| --- | --- | --- | --- |
| hp-roman8 | Latin | 19 | An 8-bit Latin character set. |
| IBM437 | Latin | 32 | Original character set of the IBM PC. Also known as CP437. |
| IBM850 | Western Europe | 33 | A code page used in Western Europe. Also referred to as MS-DOS Code Page 850. |
| IBM1047 | EBCDIC Latin 1 | 10 | An EBCDIC code page used for Latin 1. |
| ISO-8859-1 | Latin 1 | 1 | A standard 8-bit Latin alphabet character set for Western European languages. |
| ISO-8859-2 | Latin 2 | 2 | An 8-bit character set for Central and Eastern European languages that use the Latin alphabet, such as Polish, Czech, and Hungarian. Commonly referred to as Latin 2. |
| ISO-8859-3 | Latin 3 | 13 | An 8-bit character encoding. Formerly used to cover Turkish, Maltese, and Esperanto. Also known as "South European". |
| ISO-8859-4 | Latin 4 | 14 | An 8-bit character encoding originally used for Estonian, Latvian, Lithuanian, Greenlandic, and Sami. Also known as "North European". |
| ISO-8859-5 | Latin/Cyrillic | 3 | An 8-bit Cyrillic character set that can be used for Bulgarian, Belarusian, and Russian. |
| ISO-8859-6 | Latin/Arabic | 9 | An 8-bit Arabic (limited) character set. |
| ISO-8859-7 | Latin/Greek | 4 | An 8-bit character set for modern Greek. |
| ISO-8859-8 | Latin/Hebrew | 11 | An 8-bit character set that contains all of the Hebrew letters but no Hebrew vowel signs. ISO-8859-8 is also its MIME charset name. |
| ISO-8859-9 | Turkish | 5 | An 8-bit character set covering Turkish. It matches ISO-8859-1 except that the Icelandic letters are replaced with Turkish ones. Also known as Latin-5. |
| ISO-8859-10 | Nordic | 15 | An 8-bit character set designed for Nordic languages. Also known as Latin-6. |
| ISO-8859-11 | Latin/Thai | 6 | An 8-bit character set covering Thai. Nearly identical to TIS-620. |
| ISO-8859-13 | Baltic | 16 | An 8-bit character set covering Baltic languages. Also known as Latin-7 or "Baltic Rim". |
| ISO-8859-14 | Celtic | 17 | An 8-bit character set covering Celtic languages such as Gaelic, Welsh, and Breton. Also known as Latin-8 or Celtic. |
| ISO-8859-15 | Latin 9 | 18 | An 8-bit character set for English, French, German, Spanish, and Portuguese, as well as other Western European languages. |
| KOI8-R | Russian | 12 | An 8-bit character set covering Russian. |
| Shift-JIS | Japanese | | A Japanese encoding that combines single-byte and double-byte characters. The double-byte characters are based on JIS X 0208. |
| TIS-620 | Thai | 20 | A character set used for the Thai language. |
| UCS-2BE | Big Endian | 7 | A 16-bit Unicode encoding in which the highest-order byte is stored at the lowest address. Similar to UTF-16. |
| UCS-2LE | Little Endian | 8 | A 16-bit Unicode encoding in which the lowest-order byte is stored at the lowest address. Similar to UTF-16. |
| US-ASCII | ASCII | 31 | ASCII (American Standard Code for Information Interchange) is a 7-bit character set based on the English alphabet. |
| UTF-8 | Unicode | | A variable-length Unicode encoding that uses 8-bit code units. |
| Windows-874 | Windows Thai | 21 | Microsoft Windows Thai code page character set. |
| Windows-1250 | Windows Latin 2 | 22 | Windows code page representing Central European languages such as Polish, Czech, Slovak, Hungarian, Slovene, Croatian, Romanian, and Albanian. This option can also be used for German. |
| Windows-1251 | Windows Cyrillic | 23 | A Windows code page for languages written in Cyrillic, such as Russian and Bulgarian. |
| Windows-1252 | Windows Latin 1 | 24 | A Windows code page for Western European languages. Nearly identical to ISO-8859-1. |
| Windows-1253 | Windows Greek | 25 | A Windows code page used for modern Greek. |
| Windows-1254 | Windows Turkish | 26 | Represents the Turkish Windows code page. |
| Windows-1255 | Windows Hebrew | 27 | This code page is used to write Hebrew. |
| Windows-1256 | Windows Arabic | 28 | This Windows code page is used to write Arabic in Microsoft Windows. |
| Windows-1257 | Windows Baltic | 29 | Used to write Estonian, Latvian, and Lithuanian languages in Microsoft Windows. |
| Windows-1258 | Windows Vietnamese | 30 | This code page is used to write Vietnamese text. |
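
To see how the choice of encoding changes the bytes that are actually written, the following minimal Python sketch encodes the same characters under a few of the encodings listed above. It is an illustration only and is not part of the product: the codec names are Python's, not the Encoding Constant values, and `utf-16-be` and `utf-16-le` stand in for UCS-2BE and UCS-2LE (an assumption that holds for the characters used here, which each fit in one 16-bit unit). The helper `try_encode` is a hypothetical name introduced for this example.

```python
# Illustrative only: Python codec names stand in for the options in the table.
text = "A\u20ac"  # the letter "A" followed by the euro sign

# Byte order: the same 16-bit units, stored in opposite order.
big_endian = text.encode("utf-16-be")     # assumed stand-in for UCS-2BE
little_endian = text.encode("utf-16-le")  # assumed stand-in for UCS-2LE
print(big_endian.hex(" "))     # 00 41 20 ac
print(little_endian.hex(" "))  # 41 00 ac 20

# UTF-8 is variable length: 1 byte for "A", 3 bytes for the euro sign.
utf8_lengths = [len(ch.encode("utf-8")) for ch in text]
print(utf8_lengths)            # [1, 3]


def try_encode(s, codec):
    """Return the encoded bytes, or None if the codec cannot represent s."""
    try:
        return s.encode(codec)
    except UnicodeEncodeError:
        return None


# Single-byte code pages give the euro sign different bytes, or none at all.
euro = "\u20ac"
print(try_encode(euro, "iso-8859-15"))  # b'\xa4'
print(try_encode(euro, "cp1252"))       # b'\x80' (Windows-1252)
print(try_encode(euro, "iso-8859-1"))   # None
```

The last three lines show why the encoding selected here has to match the encoding the receiving system expects: the same character may map to a different byte, or may not be representable at all.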