Character transcoding is the process of translating characters from one encoding to another. Transcoding ensures the integrity of your data when it is displayed in a browser. It provides national language support and enables users to display your HTML files that contain characters not included in the ISO Latin 1 encoding.
You may want to implement transcoding if
Your data may not require transcoding, but it may require a specific encoding (or character set) to display correctly. You can specify the character set in your HTML file to ensure that all browsers attempt to use the necessary encoding.
Transcoding support for the HTML Formatters is offered only with Version 1.1 and higher.
Character transcoding takes data that exists in one encoding and makes that data available in another encoding. You can think of this process as moving data from its native state to its displayable state.
All Web browsers have a default setting for document encoding. Users can change this setting to any encoding supported by their browser. You cannot be sure that all users viewing your HTML pages have the appropriate setting; therefore, you may also need to specify the character set (or encoding) in your HTML file.
To make transcoding work for you, determine the native state of your data and the desired displayable state for your output. Then find or create a transcoding list for the two encodings. The transcoding list acts as a translate table between two encodings. It supplies the Numeric Character References (NCR) that should be used for each character.
When a character is transcoded, it is replaced by its Numeric Character Reference (NCR). An NCR has the general format &#nnnnn;, where nnnnn is the decimal value of the character. The formatters write the NCR to the HTML file if the value of nnnnn is greater than 127, with one exception: if nnnnn is 0, the character is transcoded to the character reference for a non-breaking space. If the NCR value is less than or equal to 127, the formatters put the actual character in the HTML file.
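The rule above can be sketched in a few lines of Python. This is only an illustration of the decision logic, not the formatters' implementation. The names emit, transcode, and table are hypothetical, the table contents are made up for the example, and the use of &#160; as the non-breaking space reference is an assumption (the text says only that a value of 0 produces a non-breaking space). Passing characters that are missing from the table through unchanged is also an assumption.

```python
# Assumption: the non-breaking space is written as the numeric reference &#160;.
NBSP_REFERENCE = "&#160;"


def emit(char: str, ncr_value: int) -> str:
    """Return the HTML text for one character, given its NCR value."""
    if ncr_value == 0:
        # Special case: a value of 0 becomes a non-breaking space.
        return NBSP_REFERENCE
    if ncr_value > 127:
        # Values above 127 are written as &#nnnnn;
        return f"&#{ncr_value};"
    # Values from 1 to 127: the actual character is written.
    return char


def transcode(text: str, table: dict[str, int]) -> str:
    """Apply a translate table (character -> NCR value) to a string.

    Characters that are not in the table are passed through unchanged
    (an assumption made for this sketch).
    """
    return "".join(emit(ch, table[ch]) if ch in table else ch for ch in text)


# Hypothetical translate table; the values here happen to be Unicode code points.
table = {"é": 233, "A": 65, "\x00": 0}

print(transcode("Café A", table))  # Caf&#233; A
```

In this sketch the table plays the role of the transcoding list: it supplies the NCR value for each character, and emit decides whether that value is written as a reference or as the character itself.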
To help you understand when and how to use character transcoding, see Examples of Transcoding and National Language Support.
To implement character transcoding, complete the following steps:
When you choose the encoding for the resulting HTML file, be sure to verify that the users of the page have browsers that support the selected encoding. For example, Unicode is supported only by the latest browsers, which include version 4.x of Microsoft Internet Explorer and Netscape Communicator.
We provide a variety of transcoding lists that you can use. If we have provided the transcoding list that you need, skip to step 5. If not, continue with the next step.
Create a data set or TRANTAB that contains the necessary transcoding information.
Example 5 shows how to create the necessary data set when there is not an appropriate transcoding list.
Use the MAKETL macro to create your transcoding list.
You may also need to indicate a character set to be used when displaying the HTML page. The character set is specified using the <META> tag in your HTML file. See Specifying the Character Set for more information.
You can perform transcoding when using the formatters in either batch or interactive mode. If you are working in interactive mode, be sure to provide a value for the Transcoding List Name field. You may also want to complete the Character Set Name entry field. If you are using the formatters in batch mode, use the TRANLIST and CHARSET arguments in your macro call.
tranlist=transcoding-list-name
specifies the name and location of an existing transcoding list. This argument is required only if you are implementing character transcoding. The transcoding list name must be a four-level name, and the fourth level must be SLIST.
The transcoding list can be one of the lists we provided for you, or it can be a transcoding list that you create for your specific needs. You may also need to specify the character set that you want the browser to use when displaying your HTML page.
charset=character-set-name
All Web browsers have a default setting for encoding (or character set). The browser uses the specified encoding to render pages that the user requests. If you, as the page creator, want to override the default setting, you can include the <META> tag with a character set designation at the top of your file.
To fully support national characters, the formatters give you an easy way to include this tag in your HTML files: the CHARSET argument. If you provide a character set name in the Character Set Name entry field or by using the CHARSET argument, the meta information is added to the top of your file.
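As a rough illustration of what such meta information looks like, the following Python sketch builds a <META> tag from a character set name. The function name charset_meta is hypothetical, and the exact text that the formatters write is not shown in this document. The sketch assumes the common HTML form that declares the character set through the Content-Type value.

```python
import html


def charset_meta(charset: str) -> str:
    """Build a <META> tag that declares the character set of an HTML page.

    Assumption: the tag uses the HTTP-EQUIV="Content-Type" form.
    The character set name is inserted as given, with only HTML
    attribute escaping applied.
    """
    value = html.escape(f"text/html; charset={charset}", quote=True)
    return f'<META HTTP-EQUIV="Content-Type" CONTENT="{value}">'


print(charset_meta("iso-8859-2"))
# <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-2">
```

A tag of this kind belongs in the head of the page, before any text that depends on the encoding.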
Character set support and names will vary across browsers and even releases of browsers. For this reason, the formatters do not perform any error checking on the value you provide for the character set name. Please check your HTML pages using your target browsers whenever possible.
You might find this list of character set names helpful:
http://www.iana.org/assignments/character-sets.
For more information, see the following topics: