Data Conversions and Encodings

An encoding maps each character in a character set to a unique numeric representation, which results in a table of code points. A single character can have different numeric representations in different encodings. For example, the ASCII encoding for the dollar symbol $ is 24 hexadecimal. The Danish EBCDIC encoding for the dollar symbol $ is 67 hexadecimal. In order for a version of SAS that typically uses ASCII to properly interpret a data set that is encoded in Danish EBCDIC, the data must be transcoded.
Transcoding is the process of converting data from one encoding to another. When SAS transcodes the ASCII dollar sign to the Danish EBCDIC dollar sign, the hexadecimal representation of the character changes from 24 to 67.
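The idea can be illustrated outside SAS with a minimal Python sketch. It is not how SAS implements transcoding; it only shows the general technique of decoding bytes from a source encoding and re-encoding them in a target encoding. The helper name transcode is hypothetical. The sketch assumes Python's cp037 codec (US/Canada EBCDIC) as a stand-in for Danish EBCDIC, so the dollar sign becomes 5B hexadecimal here rather than 67.

```python
def transcode(data: bytes, source: str, target: str) -> bytes:
    """Convert bytes from the source encoding to the target encoding.

    Hypothetical helper for illustration: decode to Unicode text,
    then encode that text in the target encoding.
    """
    return data.decode(source).encode(target)


# The dollar sign in ASCII is a single byte, 0x24.
ascii_bytes = "$".encode("ascii")

# Assumption: cp037 (US/Canada EBCDIC) stands in for Danish EBCDIC.
ebcdic_bytes = transcode(ascii_bytes, "ascii", "cp037")

print(ascii_bytes.hex())   # 24
print(ebcdic_bytes.hex())  # 5b
```

The character is the same before and after; only its numeric representation changes. Transcoding back with transcode(ebcdic_bytes, "cp037", "ascii") recovers the original byte.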
To determine the encoding of a particular SAS data set in SAS 9 and later:
  1. Locate the data set with SAS Explorer.
  2. Right-click the data set.
  3. Select Properties from the menu.
  4. Click the Details tab.
  5. Find the encoding of the data set, which is listed along with other information.
Here are several situations where data might commonly be transcoded:
  • when you share data between two different SAS sessions that are running in different locales or in different operating environments
  • when you perform text-string operations, such as converting to uppercase or lowercase
  • when you display or print characters from another language
  • when you copy and paste data between SAS sessions that are running in different locales
For more information about SAS features that are designed to handle transcoding between different encodings or operating environments, see SAS National Language Support (NLS): Reference Guide.