Moving Data across Environments with Different Encodings

Transcoding

Moving data across environments that use the same encoding is straightforward, but moving data across environments that use different encodings can be more difficult. When the encoding of a file is incompatible with the encoding of the computer environment that processes it, transcoding occurs.
Transcoding is the process of mapping data from one encoding to another, such as mapping data from an ASCII-based encoding to an EBCDIC-based encoding. Transcoding does not translate text from one language to another; it remaps each character to the code point that represents the same character in the target encoding.
For example, consider a file that was created on a UNIX platform that uses the Latin1 encoding, then moved to an IBM mainframe that uses the Danish EBCDIC encoding. When the file is processed on the IBM mainframe, the data is remapped from the Latin1 encoding to the Danish EBCDIC encoding. If the data contains a dollar sign ($), the byte that represents it is converted from hexadecimal 24 to hexadecimal 67.
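The remapping described above can be sketched outside SAS in a few lines of Python. This is only an illustration, not how SAS implements transcoding. The helper name transcode is hypothetical, and the sketch assumes the international EBCDIC code page cp500 as a stand-in for Danish EBCDIC, because the Python standard library does not appear to ship a Danish EBCDIC codec. As a result, the dollar sign maps to hexadecimal 5B here rather than 67.

```python
def transcode(data: bytes, source: str, target: str) -> bytes:
    """Remap bytes from one encoding to another (hypothetical helper).

    Decoding turns bytes into characters; encoding turns the same
    characters back into bytes under the target encoding.
    """
    return data.decode(source).encode(target)


# Text as it would be stored on a Latin1 (ASCII-based) platform.
latin1_bytes = "Cost: $5".encode("latin-1")

# Assumption: cp500 stands in for the Danish EBCDIC encoding in the text.
ebcdic_bytes = transcode(latin1_bytes, "latin-1", "cp500")

print(latin1_bytes.hex(" "))  # the dollar sign appears as 24
print(ebcdic_bytes.hex(" "))  # the dollar sign appears as 5b under cp500

# The characters are unchanged; only their byte values differ.
round_trip = transcode(ebcdic_bytes, "cp500", "latin-1")
```

The round trip back to Latin1 returns the original bytes, which shows that the characters themselves are preserved and only their numeric representation changes.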
Transcoding can occur in the following situations:
  • when you move a SAS file from one platform to another and the file's encoding is incompatible with the current session encoding. An example might be moving a SAS file from a z/OS operating environment with an EBCDIC-based encoding to a Windows operating environment with an ASCII-based encoding.
  • when you share data between two SAS sessions (as in a client/server environment) that have incompatible session encodings.
  • when you read and write an external file.
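The external file case can be sketched in the same way: the file is read under one encoding and written under another. This is an illustrative Python sketch, not SAS syntax. The function name transcode_file and the file names are hypothetical, and cp500 is again assumed as a stand-in EBCDIC code page.

```python
import tempfile
from pathlib import Path


def transcode_file(src_path, dst_path, src_encoding, dst_encoding):
    """Copy a text file, decoding with one encoding and encoding with another."""
    # newline="" keeps line endings exactly as they appear in the source.
    with open(src_path, "r", encoding=src_encoding, newline="") as src, \
         open(dst_path, "w", encoding=dst_encoding, newline="") as dst:
        for line in src:
            dst.write(line)


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "report_latin1.txt"   # hypothetical file names
    dst = Path(tmp) / "report_ebcdic.txt"
    src.write_bytes("Total: $24\n".encode("latin-1"))

    # Assumption: cp500 stands in for the target EBCDIC encoding.
    transcode_file(src, dst, "latin-1", "cp500")

    source_bytes = src.read_bytes()
    target_bytes = dst.read_bytes()

print(source_bytes.hex(" "))
print(target_bytes.hex(" "))
```

Both files hold the same characters, but a byte-for-byte comparison shows different contents, which is why a program must know the encoding of an external file before it reads or writes it.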

How Base SAS Transcodes Data

Base SAS provides transcoding when you move data and applications from one environment to another. To transcode data from one encoding to another, SAS uses translation tables, such as the table that maps Wlatin2 (Windows) to ISO Latin2 (UNIX).
For example:
  • When you use the CPORT and CIMPORT procedures to create and import a transport file, SAS automatically uses translation tables to transcode from one encoding to another and back again. First, the data is converted from the source encoding to transport format; then the data is converted from transport format to the target encoding.
  • When you process a SAS data set whose encoding differs from the current session encoding, SAS automatically uses CEDA (cross-environment data access) software to transcode the data. (CEDA is the same SAS software that converts a SAS file to the correct data representation when you move a file from one platform to another.)
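The idea of a translation table can be illustrated with a 256-entry byte lookup. The following Python sketch is not the SAS table format; it only shows the general technique. It assumes that Python's cp1250 and iso8859_2 codecs correspond to Wlatin2 and ISO Latin2, and the function name build_translation_table is hypothetical. Bytes with no counterpart in the target encoding are mapped to a question mark.

```python
def build_translation_table(source: str, target: str) -> bytes:
    """Build a 256-byte lookup table from one single-byte encoding to another."""
    table = bytearray(256)
    for byte in range(256):
        # Undefined source bytes decode to U+FFFD, and characters missing
        # from the target encode to "?", so every entry is exactly one byte.
        char = bytes([byte]).decode(source, errors="replace")
        table[byte] = char.encode(target, errors="replace")[0]
    return bytes(table)


# Assumption: cp1250 ~ Wlatin2 (Windows), iso8859_2 ~ ISO Latin2 (UNIX).
WLATIN2_TO_LATIN2 = build_translation_table("cp1250", "iso8859_2")

text = "Žluťoučký kůň"
windows_bytes = text.encode("cp1250")

# Transcoding is now a single table lookup per byte.
unix_bytes = windows_bytes.translate(WLATIN2_TO_LATIN2)

print(windows_bytes.hex(" "))
print(unix_bytes.hex(" "))
```

A table lookup of this kind works only between single-byte encodings, where each input byte maps to exactly one output byte.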