Data Conversions and Encodings

An encoding maps each character in a character set to a unique numeric representation, producing a table of code points. The same character can have different numeric representations in different encodings. For example, the ASCII encoding of the dollar sign ($) is hexadecimal 24, whereas the Danish EBCDIC encoding of the dollar sign is hexadecimal 67. For a version of SAS that normally uses ASCII to correctly interpret a data set that is encoded in Danish EBCDIC, the data must be transcoded.
Transcoding is the process of converting data from one encoding to another. When the ASCII dollar sign is transcoded to the Danish EBCDIC dollar sign, the hexadecimal representation of the character is converted from 24 to 67.
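The idea can be illustrated outside SAS with a minimal Python sketch. It is only an illustration, not how SAS transcodes data internally. Python's standard library does not ship a Danish EBCDIC codec, so the sketch assumes the US/Canada EBCDIC code page cp037 as a stand-in, in which the dollar sign has a different value from the Danish one. The helper name transcode is hypothetical.

```python
# Encode the same character in two encodings and compare the byte values.
# Assumption: cp037 (US/Canada EBCDIC) stands in for Danish EBCDIC,
# which is not available in the Python standard library.
ascii_bytes = "$".encode("ascii")
ebcdic_bytes = "$".encode("cp037")


def transcode(data: bytes, source: str, target: str) -> bytes:
    """Decode bytes from the source encoding, then re-encode them in the target."""
    return data.decode(source).encode(target)


# Transcoding the ASCII dollar sign yields the EBCDIC byte for the same character.
converted = transcode(ascii_bytes, "ascii", "cp037")

print("ASCII: ", ascii_bytes.hex())
print("EBCDIC:", ebcdic_bytes.hex())
print("ASCII transcoded to EBCDIC:", converted.hex())
```

The character stays the same throughout; only its numeric representation changes, and transcoding in the reverse direction restores the original byte.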
To determine the encoding of a particular SAS data set in SAS 9 and later, follow these steps:
  1. Locate the data set with SAS Explorer.
  2. Right-click the data set.
  3. Select Properties from the menu.
  4. Click the Details tab.
  5. Find the encoding of the data set, which is listed along with other information.
Data is commonly transcoded in situations such as the following:
  • when you share data between two different SAS sessions that are running in different locales or in different operating environments,
  • when you perform text-string operations, such as converting to uppercase or lowercase,
  • when you display or print characters from another language,
  • when you copy and paste data between SAS sessions running in different locales.
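To see why sharing data between sessions with different encodings calls for transcoding, consider the following Python sketch. It is an illustration only and does not use any SAS facility. The sample text is hypothetical, cp037 (US/Canada EBCDIC) is assumed as a stand-in for an EBCDIC session encoding, and Latin-1 is assumed for the receiving session.

```python
# Hypothetical text written by a session that stores data in EBCDIC.
# Assumption: cp037 stands in for the writing session's encoding.
original = "TOTAL $5"
stored = original.encode("cp037")

# Reading the bytes as if they were Latin-1 (no transcoding) garbles the text.
misread = stored.decode("latin-1")

# Decoding with the encoding the data was written in recovers the text.
recovered = stored.decode("cp037")

print(repr(misread))
print(recovered)
```

The bytes are identical in both reads. Only the encoding used to interpret them differs, which is why the receiving session must know the source encoding before it can transcode.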
For more information about SAS features that handle transcoding of data from different encodings or operating environments, see SAS National Language Support (NLS): Reference Guide.