SAS Institute. The Power to Know

SAS(R) 9.2 National Language Support (NLS): Reference Guide


Data Set Options for NLS

ENCODING= Data Set Option



Overrides the encoding to use for reading or writing a SAS data set.
Valid in: DATA step and PROC steps
Category: Data Set Control

Syntax
Syntax Description
Details
Comparisons
Examples
Example 1: Creating a SAS Data Set with Mixed Encodings and with Transcoding Suppressed
Example 2: Creating a SAS Data Set with a Particular Encoding
Example 3: Overriding Encoding for Input Processing
See Also

Syntax

ENCODING= ANY | ASCIIANY | EBCDICANY | encoding-value


Syntax Description

ANY

specifies that no transcoding occurs.

Note:   ANY is a synonym for binary. Because the data is binary, the actual encoding is irrelevant.

ASCIIANY

specifies that no transcoding occurs when the mixed encodings are ASCII encodings.

EBCDICANY

specifies that no transcoding occurs when the mixed encodings are EBCDIC encodings.

encoding-value

specifies an encoding value. For details, see Encoding for NLS.


Details

The value for ENCODING= indicates that the SAS data set has a different encoding from the current session encoding. When you read data from a data set, SAS transcodes the data from the specified encoding to the session encoding. When you write data to a data set, SAS transcodes the data from the session encoding to the specified encoding.
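To see which two encodings are involved for a given data set, you can compare the session encoding with the encoding recorded in the file. The following is a minimal sketch rather than a prescribed procedure: MYFILES.DIFENCODING is only a placeholder data set name (borrowed from Example 2), and 'SAS data-library' stands in for a real library path.

```sas
/* Placeholder library path; substitute a real location. */
libname myfiles 'SAS data-library';

/* Write the current value of the ENCODING= system option */
/* (the session encoding) to the SAS log.                 */
proc options option=encoding;
run;

/* List the data set attributes. The Encoding attribute   */
/* shows the encoding recorded in the file descriptor.    */
proc contents data=myfiles.difencoding;
run;
```

If the two values differ, SAS transcodes character data when it reads the file, unless the ENCODING= data set option specifies otherwise.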

Input Processing

By default, encoding for input processing is determined as follows:

  • If the session encoding and the encoding that is specified in the file are different, SAS transcodes the data to the session encoding.

  • If a file has no encoding specified, but the file's data representation is different from the encoding of the current session, then SAS transcodes the data to the current session encoding.
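The default input behavior and an override can be contrasted in a short sketch. It assumes that a data set named MYFILES.DIFENCODING (a placeholder name) already exists and that its recorded encoding differs from the session encoding.

```sas
/* Default: the encoding in the file differs from the     */
/* session encoding, so SAS transcodes the character data */
/* to the session encoding as it reads the file.          */
proc print data=myfiles.difencoding;
run;

/* Override: ENCODING=ANY tells SAS not to transcode, so  */
/* the character data is read exactly as it is stored.    */
proc print data=myfiles.difencoding (encoding=any);
run;
```

The second step is appropriate only when the stored bytes are already meaningful in the current session; otherwise the printed characters can appear garbled.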

Output Processing

By default, encoding for output processing is determined as follows:

  • Data is written to a file using the encoding of the current session, except when a different output representation is specified using the OUTREP= data set option, the OUTENCODING= option in the LIBNAME statement, or the ENCODING= data set option.

  • If a new file replaces an existing file, then the new file will inherit the encoding of the existing file.

  • If an existing file is replaced by a new file that was created under a different operating environment or that has no encoding specified, the new file will use the encoding of the current session.

Note:   Character metadata and data output will appear garbled if you specify an encoding that differs from the encoding in which the data set was created.

In this example, the data set to be printed is internally encoded as ASCII; however, the data set option specifies an EBCDIC encoding. SAS attempts to transcode the data from EBCDIC to ASCII, but the data is already in ASCII. The result is garbled data.

data a;
x=1;
abc='abc';
run;
proc print data=a (encoding="ebcdic");
run;


Note:   The following values for ENCODING= are invalid:

  • UCS2

  • UCS4

  • UTF16

  • UTF32

Comparisons

  • Session encoding is specified using the ENCODING= system option or the LOCALE= system option, with each operating environment having a default encoding.

  • You can specify encoding for a SAS library by using the LIBNAME statement's INENCODING= option (for input files) and the OUTENCODING= option (for output files). If both the LIBNAME statement option and the ENCODING= data set option are specified, SAS uses the data set option.
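The precedence between the LIBNAME statement option and the data set option can be sketched as follows. This is an illustration under stated assumptions: the session encoding is taken to be something other than Wlatin2, the data set names FROMLIB and FROMOPT are hypothetical, and 'SAS data-library' stands in for a real library path.

```sas
/* Library-level setting: new files in MYFILES are        */
/* written in Wlatin2 unless a data set option overrides. */
libname myfiles 'SAS data-library' outencoding=wlatin2;

/* No data set option is given, so the OUTENCODING= value */
/* from the LIBNAME statement applies (Wlatin2).          */
data myfiles.fromlib;
   name='abc';
run;

/* Both options are in effect here. The ENCODING= data    */
/* set option takes precedence, so this file is written   */
/* in Wlatin1.                                            */
data myfiles.fromopt (encoding=wlatin1);
   name='abc';
run;
```

Running PROC CONTENTS on each data set afterward is one way to confirm which encoding was recorded in the descriptor.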


Examples


Example 1: Creating a SAS Data Set with Mixed Encodings and with Transcoding Suppressed

By specifying the data set option ENCODING=ANY, you can create a SAS data set that contains mixed encodings, and suppress transcoding for either input or output processing.

In this example, the new data set MYFILES.MIXED contains some data that uses the Latin1 encoding, and some data that uses the Latin2 encoding. When the data set is processed, no transcoding occurs. For example, you will see correct Latin1 characters in a Latin1 session encoding and correct Latin2 characters in a Latin2 session encoding.

libname myfiles 'SAS data-library';

data myfiles.mixed (encoding=any);
   set work.latin1;
   set work.latin2;
run;


Example 2: Creating a SAS Data Set with a Particular Encoding

For output processing, you can override the current session encoding. This action might be necessary, for example, if the normal access to the file will use a different session encoding.

For example, if the current session encoding is Wlatin1, you can specify ENCODING=WLATIN2 to create a data set that uses the Wlatin2 encoding. The following statements tell SAS to write the data to the new data set using the Wlatin2 encoding instead of the session encoding. The encoding is also recorded in the descriptor portion of the file.

libname myfiles 'SAS data-library';

data myfiles.difencoding (encoding=wlatin2);
      .
      .
      .
run;


Example 3: Overriding Encoding for Input Processing

For input processing, you can override the encoding that is specified in the file, and specify a different encoding.

For this example, the current session encoding is EBCDIC-870, but the file has the encoding value EBCDIC-1047 in the descriptor information. When you specify ENCODING=EBCDIC870, SAS does not transcode the data but instead displays the data using the EBCDIC-870 encoding.

proc print data=myfiles.mixed (encoding=ebcdic870);
run;


See Also

Conceptual discussion in Encoding for NLS

Data Set Options:

SORTSEQ= Data Set Option

Options in Statements and Commands:

ENCODING= Option

INENCODING= and OUTENCODING= Options

System Options:

ENCODING System Option: OpenVMS, UNIX, Windows, and z/OS

LOCALE System Option
