Understanding Character Set Encoding

IMS does not use character sets or code pages and does not transcode data, so SAS is responsible for interpreting the data. The IMS engine therefore handles any transcoding of character data that is written to an IMS database and of character data that is returned from an IMS database.
The default encoding behavior is as follows:
  • for output processing to a new IMS database (one that did not previously exist), the data is written to the new database using the current SAS session encoding.
  • for output processing to an existing IMS database, the new data inherits the encoding of the existing data in the database.
  • for input (read) processing, if the SAS session encoding and the encoding on the IMS database are incompatible, the data is transcoded to the session encoding. If the database does not have an encoding, SAS transcodes the data only if the host platform is different.
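Because these defaults depend on the SAS session encoding, it can help to confirm that encoding before reading or writing IMS data. The following is a minimal sketch that uses general SAS facilities (PROC OPTIONS and the SYSENCODING automatic macro variable), not anything specific to the IMS engine:

```sas
/* Write the current session encoding to the SAS log. */
proc options option=encoding;
run;

/* The same value is available in the automatic macro variable SYSENCODING. */
%put NOTE: Session encoding is &sysencoding;
```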
The SAS/ACCESS interface to IMS supports the ENCODING= data set option, which overrides the default encoding for a specific input or output file. For example, when you are reading an IMS database using an IMS view descriptor, the ENCODING= data set option enables you to specify an encoding that is different from the session encoding. The data is transcoded from the database encoding to the session encoding as the data is read from the IMS database. In the following example, ENCODING=LATIN2 is specified for the view descriptor imsview:
   proc print data=imsview (encoding=latin2);
   run;
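The same option can presumably be used in a DATA step when the IMS data is copied into a SAS data set. The sketch below assumes that the view descriptor imsview from the example above is available; work.imscopy is a hypothetical data set name chosen for illustration. PROC CONTENTS then reports the encoding that is recorded for the resulting SAS data set.

```sas
/* Read the IMS data through the view descriptor, treating it as LATIN2. */
/* work.imscopy is an illustrative name, not part of the original example. */
data work.imscopy;
   set imsview (encoding=latin2);
run;

/* Display the attributes of the copy, including its Encoding value. */
proc contents data=work.imscopy;
run;
```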
Some of the reasons that you might want to override encoding behavior by using the ENCODING= data set option are as follows:
  • to create output in an encoding that is different from the current session encoding or that is the encoding for an existing IMS database.
  • to create output that contains mixed encodings.
  • to request that no transcoding occurs.
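As a sketch of the last case, the general ENCODING= data set option accepts the value ANY to request that no transcoding occur. Whether the IMS engine accepts ANY for a given view descriptor is an assumption here and should be checked against the reference cited below; imsview is the view descriptor name from the earlier example.

```sas
/* Assumption: ENCODING=ANY suppresses transcoding for this view, */
/* as it does for SAS data sets.                                  */
proc print data=imsview (encoding=any);
run;
```

With no transcoding, character values are returned as the bytes stored in the database, so they display as intended only if those bytes are already meaningful in the session encoding.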
For more information about the ENCODING= data set option, see the SAS Data Set Options: Reference.