Examples of Transcoding and National Language Support

These examples illustrate some cases in which character transcoding is necessary. Each example follows the steps for implementing transcoding with the formatters.

Example 1: Displaying data that include Greek characters

If your data, in its native state, includes Greek characters and uses the Windows Greek encoding (Windows-1253), you may want to transcode this data to ensure that the Greek characters display properly.

In this example, you will be transcoding from Windows-1253 encoding to ISO 8859-7. Look for a transcoding list that addresses your needs. To find the appropriate list name, look at the tables of provided transcoding lists. The tables are organized by the encoding you are transcoding to, so look for transcoding to ISO 8859.

In the Description column, locate the name of the encoding, which is ISO 8859-7 for this example. Find the description Win cp1253-greek to ISO 8859/7-greek. The Entry Name associated with this description is WGRKGREK; this is the transcoding list to use.

When you run the formatter, you have to specify the name of the transcoding list. In your SAS code specify the list name using the TRANLIST argument:

   tranlist=sashelp.htmlnls.wgrkgrek.slist

You may also want to specify the encoding using the CHARSET argument:

   charset=iso-8859-7
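
To see why these two Greek encodings need a transcoding step at all, the following Python sketch re-encodes a short string with Python's standard cp1253 and iso8859_7 codecs. It runs outside SAS and is not how the formatters are implemented; the sample word and the function name are assumptions made for illustration.

```python
def windows_greek_to_iso_greek(raw: bytes) -> bytes:
    """Re-encode Windows-1253 bytes as ISO 8859-7 bytes."""
    # Decode to Unicode first, then encode in the target encoding.
    return raw.decode("cp1253").encode("iso8859_7")


# Hypothetical sample: a Greek word as stored on a Windows Greek system.
native = "Άλφα".encode("cp1253")
displayable = windows_greek_to_iso_greek(native)

# Both byte strings carry the same text once decoded with the matching codec.
assert displayable.decode("iso8859_7") == native.decode("cp1253")

# Show the raw bytes side by side; the accented capital differs,
# while the plain lowercase letters share the same code points.
print(native.hex(), displayable.hex())
```

Because only some characters move between the two code pages, untranscoded data can look mostly right in a browser while individual characters display incorrectly.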

Example 2: Making an HTML file FTP-safe

You have a data set that contains data in ISO Latin 1 encoding and you want to display the data using this encoding, but you need to move the HTML file between EBCDIC and ASCII systems. To preserve data integrity during the FTP transfer, the file must be FTP-safe. An FTP-safe HTML file contains only ASCII characters in the range 0x00 - 0x7F. HTML files that you transcode using the transcoding lists supplied with the HTML Formatters are FTP-safe.

In its native state, this data has the correct encoding for display in the browser. Because you will not be transcoding from one encoding to another, use SASHELP.HTMLGEN.IDENTITY.SLIST as your transcoding list. (Note that the IDENTITY transcoding list is in the SASHELP.HTMLGEN catalog, not the SASHELP.HTMLNLS catalog where all other lists reside.) The IDENTITY transcoding list does a one-to-one transcoding for all characters at code point 0x80 or higher. For example, all occurrences of the character ñ (0xF1, decimal 241) are transcoded to &#241;.

You may also want to specify the encoding using the CHARSET argument:

   charset=iso-8859-1
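
The idea of an FTP-safe file can be sketched in Python, outside SAS. The example replaces every character above 0x7F with a decimal numeric character reference, which is one way to keep an HTML file within 7-bit ASCII. The function name and sample text are hypothetical, and the sketch does not claim to reproduce the IDENTITY list's exact output.

```python
def make_ftp_safe(text: str) -> bytes:
    """Return 7-bit ASCII bytes, replacing other characters with &#NNN; references."""
    return text.encode("ascii", errors="xmlcharrefreplace")


sample = "mañana"  # hypothetical sample text containing ñ (code point 241)
safe = make_ftp_safe(sample)
print(safe)  # b'ma&#241;ana'

# Every byte is in the range 0x00 - 0x7F.
assert all(byte <= 0x7F for byte in safe)
```

A browser that reads the file as ISO 8859-1 renders the reference as ñ, so the displayed text is unchanged while the bytes on disk stay plain ASCII.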

Example 3: Making HTML files on EBCDIC systems

Your SAS data set resides on z/OS and contains characters not in the 7-bit ASCII character set such as capital A with a ring above (Å). The data uses the IBM EBCDIC Code Page 037. You expect to FTP the HTML file from your z/OS system to a PC where the Web server is running. Transcoding will guarantee that the resulting HTML file is FTP-safe and that the EBCDIC to ASCII conversion is performed properly. (See Example 2 for a discussion of FTP-safe files.)

For this example, you are transcoding from IBM EBCDIC Code Page 037 to ISO 8859-1 (Latin 1) encoding. By checking the tables of provided transcoding lists, you determine that you need to use the E037LAT1 transcoding list. (The description of this list is EBCDIC cp037-us to ISO 8859/1-latin1.) In your SAS code, specify the list name using the TRANLIST argument:

   tranlist=sashelp.htmlnls.e037lat1.slist

You may also want to specify the encoding using the CHARSET argument:

   charset=iso-8859-1
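
As a rough illustration of the EBCDIC to Latin 1 mapping, the Python sketch below (outside SAS, using Python's standard cp037 and latin-1 codecs) converts two characters and prints their byte values before and after. The function name and sample string are assumptions made for this example.

```python
def ebcdic037_to_latin1(raw: bytes) -> bytes:
    """Re-encode IBM EBCDIC Code Page 037 bytes as ISO 8859-1 bytes."""
    return raw.decode("cp037").encode("latin-1")


# Hypothetical sample: capital A with ring above followed by an exclamation mark.
native = "Å!".encode("cp037")
displayable = ebcdic037_to_latin1(native)

# Print the byte values in each encoding as hexadecimal.
print(native.hex(), displayable.hex())  # 675a c521
```

Even the exclamation mark, which is a 7-bit ASCII character, has a different byte value in EBCDIC, which is why the conversion must cover the whole file and not only the national characters.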

Example 4: Displaying SAS Monospace font line-drawing characters

Most encodings do not include line-drawing characters. The SAS Monospace font and Unicode both include these characters. If you run a procedure using SAS software on the PC, the output may rely on the SAS Monospace font for line-drawing characters. If you want to publish the resulting HTML file on the Web, each client that displays the file must have access to the SAS Monospace font to properly display the line-drawing characters.

If your users have Web browsers that support Unicode, you can transcode the data and ensure that the line-drawing characters display properly. You must also specify the character set name in the HTML file. By transcoding your data and specifying the character set name, you eliminate the need for the SAS Monospace font to be installed on each client that displays this HTML page.

To achieve this, use a transcoding list and specify the character set. Because there are no national characters in the data, this is a general example of transcoding rather than national language support.

To transcode from the SAS Monospace encoding to Unicode, use the SLT1UNIC transcoding list. In your SAS code, specify the list name using the TRANLIST argument:

   tranlist=sashelp.htmlnls.slt1unic.slist

You also need to specify the encoding using the CHARSET argument:

   charset=utf-8

Example 5: Creating a custom transcoding list when one is not provided

You have determined that you need to transcode your output from IBM EBCDIC Code Page 285 (English/UK) to ISO 8859-1 (Latin 1). However, no transcoding list for these encodings exists. Before you can transcode the output, you must create a transcoding list. To create the transcoding list, you must have an appropriate data set or TRANTAB entry before running the MAKETL macro.

In this example, we create the necessary data set; however, you could create or use a TRANTAB entry. The data set provides a mapping for the characters in the native state to the characters in the displayable state.

The following mappings are from the data set for this example:

Character                                 Encoding in     Encoding in
                                          native state    displayable state
exclamation mark                          0x5A            0x21
Latin capital letter A                    0xC1            0x41
Latin capital letter A with ring above    0x67            0xC5

In the above example mapping, the Latin capital letter A with ring above is transcoded from 0x67 to 0xC5. For a complete list of the values in our data set, see the code that creates the data set.
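
The effect of such a mapping can be sketched in Python, outside SAS. The table below contains only the three rows listed above, so it is deliberately partial. The names E285_TO_LAT1 and transcode are hypothetical, and the sketch does not reflect the layout of the data set that MAKETL expects.

```python
# Partial mapping taken from the three rows listed above:
# byte value in the native encoding -> byte value in ISO 8859-1.
E285_TO_LAT1 = {
    0x5A: 0x21,  # exclamation mark
    0xC1: 0x41,  # Latin capital letter A
    0x67: 0xC5,  # Latin capital letter A with ring above
}


def transcode(raw: bytes, table: dict[int, int]) -> bytes:
    """Map each byte through the table; raises KeyError for an unmapped byte."""
    return bytes(table[byte] for byte in raw)


native = bytes([0xC1, 0x67, 0x5A])  # the three mapped characters, in native form
displayable = transcode(native, E285_TO_LAT1)
print(displayable.decode("latin-1"))  # AÅ!
```

A complete table needs an entry for every byte value that can occur in the data, because any gap leaves a character with no displayable equivalent.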

After you create the data set, run the MAKETL macro to create your transcoding list. To create this transcoding list, submit the following:

   %maketl(tranlist=sasuser.htmlgen.e285lat1.slist,
           desc=EBCDIC cp285-uk to ISO 8859/1-latin1,
           data=e285lat1,
           from=e285,
           to=lat1);

To transcode from IBM EBCDIC Code Page 285 (English/UK) to ISO 8859-1 (Latin 1), specify the transcoding list name in your SAS code. Use the TRANLIST argument:

   tranlist=sasuser.htmlgen.e285lat1.slist

You may also want to specify the encoding using the CHARSET argument:

   charset=iso-8859-1

________

Note: z/OS is the successor to the OS/390 and MVS operating systems. SAS/IntrNet 9.1 for z/OS is supported on the MVS, OS/390, and z/OS operating systems and, throughout this document, any reference to z/OS also applies to OS/390 and MVS, unless otherwise stated.