Transcoding Considerations

Although transcoding usually occurs with no problems, there are situations that can affect your data and produce unsatisfactory results. For example:
  • Encodings can conflict with one another. That is, two encodings can use different code points for the same character, or use the same code point for two different characters.
  • Characters in one encoding might not be present in another encoding. For example, a specific encoding might not have a character for the dollar sign ($). Transcoding the data to an encoding that does not support the dollar sign would result in the character not printing or displaying.
  • The number of bytes for a character in one encoding can be different from the number of bytes for the same character in another encoding, for example, when transcoding between a DBCS and an SBCS. Therefore, when the transcoded value needs more bytes than the variable can hold, transcoding can result in character value truncation.
  • If an error occurs during transcoding such that the data cannot be transcoded back to its original encoding, data can be lost. That is, if you open a data set for update processing, the observation might not be updated. However, if you open the data set for input (read) processing and no output data set is open, SAS issues a warning that can be printed. Processing proceeds and allows a PRINT procedure or other read operation to show the data that does not transcode.
  • CEDA has some processing limitations. For example, CEDA does not support update processing.
  • An incorrect encoding can be stamped on a SAS 7 or SAS 8 data set if it is copied or replaced in a SAS 9 session whose session encoding differs from the encoding of the data. The incorrect encoding stamp can be corrected with the CORRECTENCODING= option in the MODIFY statement in PROC DATASETS.
  • If a character variable contains binary data, transcoding might corrupt the data.
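
The first three situations in the list above can be reproduced outside SAS. The following is a minimal sketch that uses Python's built-in codecs as a stand-in for SAS encodings. It is not SAS code, and the encodings chosen (ISO 8859-1, ISO 8859-15, Shift-JIS, UTF-8) and the 6-byte field width are assumptions picked only for illustration.

```python
# Illustration only: Python codecs stand in for SAS encodings.

# 1. Conflicting code points: byte 0xA4 is the generic currency sign
#    in ISO 8859-1 but the euro sign in ISO 8859-15.
byte = b"\xa4"
as_latin1 = byte.decode("iso8859-1")    # U+00A4
as_latin9 = byte.decode("iso8859-15")   # U+20AC

# 2. Character missing from the target encoding: ISO 8859-1 has no
#    euro sign, so a lossy conversion substitutes a question mark.
euro_in_latin1 = "\u20ac".encode("iso8859-1", errors="replace")

# 3. Different byte lengths: these three characters need 6 bytes in
#    Shift-JIS (a DBCS) but 9 bytes in UTF-8.
text = "\u65e5\u672c\u8a9e"
sjis_len = len(text.encode("shift_jis"))
utf8_len = len(text.encode("utf-8"))

# A field sized for the 6-byte Shift-JIS value (hypothetical width)
# keeps only the first two characters of the UTF-8 form.
truncated = text.encode("utf-8")[:6].decode("utf-8", errors="ignore")

print(as_latin1 == as_latin9)   # same byte, different characters
print(euro_in_latin1)
print(sjis_len, utf8_len, len(truncated))
```

The third case shows why a character variable whose length was sized for one encoding might be too short after transcoding: the character count is unchanged, but the byte count is not.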