ENCODING= Data Set Option

Overrides the encoding to use for reading or writing a SAS data set.
Valid in: DATA step and PROC steps
Category: Data Set Control

Syntax

ENCODING= ANY | ASCIIANY | EBCDICANY | encoding-value

Syntax Description

ANY
specifies that no transcoding occurs.
Note: ANY is a synonym for binary. Because the data is binary, the actual encoding is irrelevant.
ASCIIANY
specifies that no transcoding occurs when the mixed encodings are ASCII encodings.
EBCDICANY
specifies that no transcoding occurs when the mixed encodings are EBCDIC encodings.
encoding-value
specifies an encoding value.

Details

The value for ENCODING= indicates that the SAS data set has a different encoding from the current session encoding. When you read data from a data set, SAS transcodes the data from the specified encoding to the session encoding. When you write data to a data set, SAS transcodes the data from the session encoding to the specified encoding.
Input Processing
By default, encoding for input processing is determined as follows:
  • If the session encoding and the encoding that is specified in the file are different, SAS transcodes the data to the session encoding.
  • If a file has no encoding specified, but the file's data representation is different from the encoding of the current session, then SAS transcodes the data to the current session encoding.
Output Processing
By default, encoding for output processing is determined as follows:
  • Data is written to a file using the encoding of the current session, except when a different output representation is specified using the OUTREP= data set option, the OUTENCODING= option in the LIBNAME statement, or the ENCODING= data set option.
  • If a new file replaces an existing file, then the new file inherits the encoding of the existing file.
  • If an existing file is replaced by a new file that was created under a different operating environment or that has no encoding specified, the new file uses the encoding of the current session.
Note: Character metadata and data output appear garbled if you specify an encoding that differs from the encoding in which the data set was created.
In this example, the data set to be printed is internally encoded as ASCII, but the data set option specifies an EBCDIC encoding. SAS attempts to transcode the data from EBCDIC to ASCII, but the data is already in ASCII. The result is garbled data.
data a;
x=1;
abc='abc';
run;
proc print data=a (encoding="ebcdic");
run;
Note: The following values for ENCODING= are invalid:
  • UCS2
  • UCS4
  • UTF16
  • UTF32

Comparisons

  • Session encoding is specified using the ENCODING= system option or the LOCALE= system option, with each operating environment having a default encoding.
  • You can specify encoding for a SAS library by using the LIBNAME statement's INENCODING= option (for input files) and the OUTENCODING= option (for output files). If both the LIBNAME statement option and the ENCODING= data set option are specified, SAS uses the data set option.
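The precedence described above can be sketched as follows. This is a minimal illustration, not a definitive setup: WORK.SOURCE and MYFILES.OVERRIDE are hypothetical data set names, 'SAS data-library' is a placeholder path, and the encodings were chosen only for the example.

```sas
/* Show the current session encoding in the log */
proc options option=encoding;
run;

/* Library-level default for output files.
   'SAS data-library' is a placeholder path. */
libname myfiles 'SAS data-library' outencoding=wlatin2;

/* WORK.SOURCE and MYFILES.OVERRIDE are hypothetical names.
   The ENCODING= data set option takes precedence over
   OUTENCODING= for this one data set. */
data myfiles.override (encoding=wlatin1);
   set work.source;
run;
```

Under these assumptions, other data sets written to MYFILES would use the library's OUTENCODING= value, while MYFILES.OVERRIDE would use the encoding named in the data set option.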

Examples

Example 1: Creating a SAS Data Set with Mixed Encodings and with Transcoding Suppressed

By specifying the data set option ENCODING=ANY, you can create a SAS data set that contains mixed encodings, and suppress transcoding for either input or output processing.
In this example, the new data set MYFILES.MIXED contains some data that uses the Latin1 encoding, and some data that uses the Latin2 encoding. When the data set is processed, no transcoding occurs. For example, the Latin1 characters are displayed correctly in a session that uses the Latin1 encoding, and the Latin2 characters are displayed correctly in a session that uses the Latin2 encoding.
libname myfiles 'SAS data-library';
data myfiles.mixed (encoding=any);
   set work.latin1;
   set work.latin2;
run;

Example 2: Creating a SAS Data Set with a Particular Encoding

For output processing, you can override the current session encoding. This action might be necessary, for example, if the normal access to the file uses a different session encoding.
For example, if the current session encoding is Wlatin1, you can specify ENCODING=WLATIN2 in order to create the data set that uses the encoding Wlatin2. The following statements tell SAS to write the data to the new data set using the Wlatin2 encoding instead of the session encoding. The encoding is also specified in the descriptor portion of the file.
libname myfiles 'SAS data-library';
data myfiles.difencoding (encoding=wlatin2);
      .
      .
      .
run;
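One way to check the encoding that is recorded in the descriptor portion is to run PROC CONTENTS, which reports an Encoding attribute for the data set. The following sketch assumes that MYFILES.DIFENCODING was created as shown in this example.

```sas
/* Assumes MYFILES.DIFENCODING exists and the MYFILES libref is assigned.
   The Encoding attribute appears in the data set attributes
   section of the PROC CONTENTS output. */
proc contents data=myfiles.difencoding;
run;
```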

Example 3: Overriding Encoding for Input Processing

For input processing, you can override the encoding that is specified in the file, and specify a different encoding.
For this example, the current session encoding is EBCDIC-870, but the file has the encoding value EBCDIC-1047 in the descriptor information. When you specify ENCODING=EBCDIC870, SAS does not transcode the data, but instead displays the data using the EBCDIC-870 encoding.
proc print data=myfiles.mixed (encoding=ebcdic870);
run;

See Also

Conceptual discussion in Encoding for NLS
Options in Statements and Commands: