Encoding Behavior in a SAS Session

Encoding Support for Data Sets by SAS Release

For Base SAS files, there are three categories of encoding support, which is based on the version of SAS that created the file:
  • Data sets that are created in SAS 9 automatically have an encoding attribute, which is specified in the descriptor portion of the file. In SAS 9, DBCS recognizes the DBCSTYPE value and converts it to the encoding value and specifies it in the descriptor portion of the field, by default.
  • Data sets that are created in SAS 7 and SAS 8 do not have an encoding value that is specified in the file. It is assumed that SAS 7 and SAS 8 data sets were created in the SAS session encoding of the operating environment. However, the descriptor portion of the file does support an encoding value. When you replace or update a SAS 7 or SAS 8 file in a SAS 9 session, SAS specifies the current session encoding in the descriptor portion of the file, by default. In SAS 8, DBCS has the DBCSTYPE field, instead of the encoding field.
  • Data sets created in SAS 6 do not have an encoding value that is associated with the file and cannot have an encoding value specified in the file.

z/OS: Ensuring Compatibility with Previous SAS Releases

Setting the NLSCOMPATMODE system option ensures compatibility with previous releases of SAS.
Note: NLSCOMPATMODE is supported under the z/OS operating environment only.
Programs that were run in previous releases of SAS will continue to work when NLSCOMPATMODE is specified.
The NONLSCOMPATMODE system option specifies that data is to be processed in the encoding that is set by the ENCODING= option or the LOCALE= option, including reading and writing external data and processing SAS syntax and user data.
Some existing programs that ran in previous releases of SAS will no longer run when NONLSCOMPATMODE is in effect. If you have made character substitutions in SAS syntax statements, you must modify your programs to use national characters. For example, a Finnish customer who has substituted the Å character for the $ character in existing SAS syntax will have to update the program to use the $ in the Finnish environment.

Output Processing

When you create a data set in SAS 9, encoding is determined as follows:
  • If a new output file is created, the data is written to the file using the current session encoding.
  • If a new output file is created using the OUTREP= option, which specifies a data representation that is different from the current session, the data is written to the file using the default session encoding for the operating system that is specified by the OUTREP= value. For more information, see OUTREP= Data Set Option.
  • If a new output file replaces an existing file, the new file inherits the encoding of the existing file. For output processing that replaces an existing file that is from another operating environment or if the existing file has no encoding that is specified in it, then the current session encoding is used.

Input Processing

For input (read) processing in SAS 9, encoding behavior is as follows:
  • Most users will choose the default behavior which is not to specify an encoding for the input file.
  • If the session encoding and the encoding that is specified in the file are incompatible, the data is transcoded to the session encoding. For example, if the current session encoding is ASCII and the encoding that is specified in the file is EBCDIC, SAS transcodes the data from EBCDIC to ASCII.
  • If a file does not have an encoding specified in it, SAS transcodes the data only if the file's data representation is different from the current session.

Reading and Writing External Files

SAS reads and writes external files using the current session encoding. SAS assumes that the external file has the same encoding as the session encoding. For example, if you are creating a new SAS data set by reading an external file, SAS assumes that the encoding of the external file and the current session are the same. If the encodings are not the same, the external data could be written incorrectly to the new SAS data set. For details about the syntax for the SAS statements that perform input and output processing, see SAS Options That Transcode SAS Data.