Option | Character Set | Encoding Constant | Description |
---|---|---|---|
hp-roman8 | Latin | 19 | An 8-bit Latin character set. |
IBM437 | Latin | 32 | Original character set of the IBM PC. Also known as CP437. |
IBM850 | Western Europe | 33 | A code page used in Western Europe. Also referred to as MS-DOS Code Page 850. |
IBM1047 | EBCDIC Latin 1 | 10 | An EBCDIC code page used for Latin 1. |
ISO-8859-1 | Latin 1 | 1 | A standard Latin alphabet character set. |
ISO-8859-2 | Latin 2 | 2 | An 8-bit character set for Central and Eastern European languages written in the Latin script, such as Polish, Czech, Slovak, and Hungarian. Commonly referred to as Latin 2. |
ISO-8859-3 | Latin 3 | 13 | An 8-bit character encoding. Formerly used to cover Turkish, Maltese, and Esperanto. Also known as "South European". |
ISO-8859-4 | Latin 4 | 14 | An 8-bit character encoding originally used for Estonian, Latvian, Lithuanian, Greenlandic, and Sami. Also known as "North European". |
ISO-8859-5 | Latin/Cyrillic | 3 | An 8-bit Cyrillic character set that can be used for Bulgarian, Belarusian, and Russian. |
ISO-8859-6 | Latin/Arabic | 9 | An 8-bit Arabic (limited) character set. |
ISO-8859-7 | Latin/Greek | 4 | An 8-bit character set for modern Greek. |
ISO-8859-8 | Latin/Hebrew | 11 | Contains all of the Hebrew letters, without Hebrew vowel signs. |
ISO-8859-9 | Turkish | 5 | An 8-bit character set covering Turkish. Identical to ISO-8859-1 except that the Icelandic letters are replaced with Turkish ones. Also known as Latin-5. |
ISO-8859-10 | Nordic | 15 | An 8-bit character set designed for Nordic languages. Also known as Latin-6. |
ISO-8859-11 | Latin/Thai | 6 | An 8-bit character set covering Thai. Nearly identical to TIS-620. |
ISO-8859-13 | Baltic | 16 | An 8-bit character set covering Baltic languages. Also known as Latin-7 or "Baltic Rim". |
ISO-8859-14 | Celtic | 17 | An 8-bit character set covering Celtic languages such as Gaelic, Welsh, and Breton. Also known as Latin-8 or Celtic. |
ISO-8859-15 | Latin 9 | 18 | An 8-bit character set for English, French, German, Spanish, and Portuguese, as well as other Western European languages. |
KOI8-R | Russian | 12 | An 8-bit character set covering Russian. |
Shift-JIS | Japanese | | Based on character sets for single-byte and double-byte characters, including JIS X 0208. |
TIS-620 | Thai | 20 | A character set used for the Thai language. |
UCS-2BE | Big Endian | 7 | The highest-order byte is stored at the lowest address. This is similar to UTF-16. |
UCS-2LE | Little Endian | 8 | The lowest-order byte is stored at the lowest address. This is similar to UTF-16. |
US-ASCII | ASCII | 31 | ASCII (American Standard Code for Information Interchange) is a character set based on the English alphabet. |
UTF-8 | Unicode | | A variable-length character encoding for Unicode that uses 8-bit code units. |
Windows-874 | Windows Thai | 21 | Microsoft Windows Thai code page character set. |
Windows-1250 | Windows Latin 2 | 22 | Windows code page representing Central European languages such as Polish, Czech, Slovak, Hungarian, Slovene, Croatian, Romanian, and Albanian. This option can also be used for German. |
Windows-1251 | | 23 | |
Windows-1252 | Windows Latin 1 | 24 | Windows code page for Western European languages. Closely matches ISO-8859-1, with additional printable characters in the 0x80 to 0x9F range. |
Windows-1253 | Windows Greek | 25 | A Windows code page used for modern Greek. |
Windows-1254 | Windows Turkish | 26 | Represents the Turkish Windows code page. |
Windows-1255 | Windows Hebrew | 27 | This code page is used to write Hebrew. |
Windows-1256 | Windows Arabic | 28 | This Windows code page is used to write Arabic in Microsoft Windows. |
Windows-1257 | Windows Baltic | 29 | Used to write Estonian, Latvian, and Lithuanian in Microsoft Windows. |
Windows-1258 | Windows Vietnamese | 30 | This code page is used to write Vietnamese text. |
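
The practical effect of choosing one option over another is easiest to see by encoding the same character several ways. The following is a minimal sketch using Python's built-in codecs. The codec names (`utf-16-be`, `cp1252`, and so on) are Python's own and are assumptions for illustration only; they are not the option names or encoding constants listed in the table. For characters in the Basic Multilingual Plane, UTF-16 produces the same bytes as UCS-2, so it stands in for the UCS-2BE and UCS-2LE options here.

```python
# Illustrative only: Python codec names, not the product's option names
# or encoding constants.

# Byte order: the same 16-bit code unit (U+0041) stored both ways.
letter = "A"
big_endian = letter.encode("utf-16-be")     # highest-order byte first
little_endian = letter.encode("utf-16-le")  # lowest-order byte first

# Code pages: the same character maps to different byte values.
e_acute = "é"  # U+00E9
as_latin1 = e_acute.encode("iso8859-1")
as_cp437 = e_acute.encode("cp437")
as_utf8 = e_acute.encode("utf-8")           # two bytes in UTF-8

# Some characters are missing from a character set entirely.
euro = "€"  # U+20AC
as_cp1252 = euro.encode("cp1252")
as_latin9 = euro.encode("iso8859-15")
try:
    euro.encode("iso8859-1")
    latin1_has_euro = True
except UnicodeEncodeError:
    latin1_has_euro = False

print(big_endian.hex(), little_endian.hex())
print(as_latin1.hex(), as_cp437.hex(), as_utf8.hex())
print(as_cp1252.hex(), as_latin9.hex(), latin1_has_euro)
```

Decoding data with a different encoding than the one it was written in gives wrong characters or errors, which is why the option selected should match the encoding of the source data.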