The SAS language and various SAS user interfaces enable the user to specify
the names associated with operating system (OS) resources such as z/OS data
set names and UFS paths. SAS also displays OS resource names as part of SAS
output, in messages to the SAS log, in various windows of the SAS windowing
environment, and so on.

z/OS resource names are maintained by z/OS as a sequence of binary code
points, rather than as characters. In other words, on z/OS, the application
programming interfaces do not associate a character encoding with z/OS
resource names. The same is true for the user interfaces associated with
z/OS components such as JES (JCL), ISPF, the UNIX System Services (USS)
shell, and so on. When a z/OS resource name is created, the encoding used to
specify the name is not stored or saved with the name itself.
The following z/OS data set name illustrates how some code points are
associated with different characters by different EBCDIC code pages:

PROD.ACCT#104.RAWDATA    (1)
PROD.ACCTÄ104.RAWDATA    (2)
DDDC4CCCE7FFF4DCECCEC    (3)
7964B1333B104B9164131    (4)

(1) the data set name as it is represented in EBCDIC 1047
(2) the data set name as it is represented in EBCDIC 1143
(3) the first hexadecimal character of the code points for the data set name
(4) the second hexadecimal character of the code points for the data set name
The tenth code point in the data set name is X’7B’. This code point
corresponds to the # character in the U.S. English code page (EBCDIC 1047).
However, the Finnish code page (EBCDIC 1143) maps the character Ä to the
code point X’7B’. Therefore, the sequence of characters required to identify
this OS resource is different for the EBCDIC 1047 and EBCDIC 1143 encodings.
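The two readings of the same code points can be reproduced with a minimal Python sketch. It rebuilds the bytes from the two hexadecimal rows of the data set name example and decodes them twice. Python's standard codecs do not include EBCDIC 1047 or 1143, so the sketch assumes that cp037 is an adequate stand-in for the letters, digits, and period in this name, and it overrides X'7B' per code page using only the mappings stated in this section. The helper name decode_name and the override tables are hypothetical.

```python
# First and second hexadecimal digits of each code point, copied from the
# two hexadecimal rows of the data set name example.
HIGH_DIGITS = "DDDC4CCCE7FFF4DCECCEC"
LOW_DIGITS = "7964B1333B104B9164131"
name_bytes = bytes(int(hi + lo, 16) for hi, lo in zip(HIGH_DIGITS, LOW_DIGITS))

# Assumption: only the code point discussed in this section is overridden.
X7B_IN_1047 = {0x7B: "#"}
X7B_IN_1143 = {0x7B: "Ä"}


def decode_name(data, overrides):
    """Decode EBCDIC bytes, using cp037 for code points without an override."""
    return "".join(
        overrides[b] if b in overrides else bytes([b]).decode("cp037")
        for b in data
    )


print(name_bytes[9] == 0x7B)                  # True: tenth code point is X'7B'
print(decode_name(name_bytes, X7B_IN_1047))   # PROD.ACCT#104.RAWDATA
print(decode_name(name_bytes, X7B_IN_1143))   # PROD.ACCTÄ104.RAWDATA
```

The byte string never changes between the two calls; only the table used to render it as characters does.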
The difference in the sequence of characters is illustrated in the following
example, which reads the external file residing in the z/OS data set. A
portion of the SAS log is shown for functionally equivalent programs that
were run with SAS 9.3 in the EBCDIC 1047 and EBCDIC 1143 encodings.
The following SAS log excerpt shows the EBCDIC 1047 encoding:

5    proc options option=encoding; run;
     ENCODING=OPEN_ED-1047
     Specifies default encoding for internal processing of data
6    filename rawdata 'prod.acct#104.rawdata';    (1)
7    data acct;
8       account = 104;
9       infile rawdata;
10      attrib transdate format=date9. informat=date9.;
11      input transdate category $ amount;    (2)
12   run;
NOTE: The infile RAWDATA is:
      Dsname=PROD.ACCT#104.RAWDATA,    (3)
      Unit=3390,Volume=SDS012,Disp=SHR,Blksize=27920,
      Lrecl=80,Recfm=FB,Creation=2011/06/09

(1) The character # in the data set name in the FILENAME statement is
    associated with the code point X’7B’ in the OPEN_ED-1047 encoding.
(2) The character $ in the INPUT statement is part of SAS syntax.
    Syntactical meaning is based on the character instead of the code point.
(3) The character # in the data set name in the NOTE message is associated
    with the code point X’7B’ in the OPEN_ED-1047 encoding.
The following SAS log excerpt shows the EBCDIC 1143 encoding:

5    proc options option=encoding; run;
     ENCODING=OPEN_ED-1143
     Specifies default encoding for internal processing of data
6    filename rawdata 'prod.acctÄ104.rawdata';    (1)
7    data acct;
8       account = 104;
9       infile rawdata;
10      attrib transdate format=date9. informat=date9.;
11      input transdate category $ amount;    (2)
12   run;
NOTE: The infile RAWDATA is:
      Dsname=PROD.ACCTÄ104.RAWDATA,    (3)
      Unit=3390,Volume=SDS012,Disp=SHR,Blksize=27920,
      Lrecl=80,Recfm=FB,Creation=2011/06/09

(1) The character Ä in the data set name in the FILENAME statement is
    associated with the code point X’7B’ in the OPEN_ED-1143 encoding.
(2) The character $ in the INPUT statement is part of SAS syntax.
    Syntactical meaning is based on the character instead of the code point.
(3) The character Ä in the data set name in the NOTE message is associated
    with the code point X’7B’ in the OPEN_ED-1143 encoding.
In the preceding log excerpts, the same code point, X’7B’, is specified in
the SAS program for the tenth character of the z/OS data set name. In
addition, the same code point, X’7B’, is used by SAS in the NOTE message to
represent the name of the data set that was read. However, different
characters are displayed for this code point in the two log excerpts because
the terminal emulation for the first log excerpt used a different encoding
than the terminal emulation for the second log excerpt.
Note that SAS processes z/OS resource names differently than it processes
SAS syntax. When NONLSCOMPATMODE is in effect, SAS syntax is interpreted
according to the value of the ENCODING option. NONLSCOMPATMODE allows the
same character to be specified in SAS syntax regardless of the SAS session
encoding that is in effect. For example, in the first log excerpt, the $
syntax character in the INPUT statement is encoded as X’5B’ because the
program is encoded in EBCDIC 1047. In the second log excerpt, the $ syntax
character is encoded as X’67’ because the program is encoded in EBCDIC 1143.
However, both SAS sessions properly recognized these code points as
corresponding to the same character, $, because the ENCODING option informed
SAS how to interpret the code points. In NONLSCOMPATMODE, syntactical
meaning is associated with the character, not a particular code point.
NONLSCOMPATMODE is the default value of the NLSCOMPATMODE system option.
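The handling of syntax characters can be modeled with a short Python sketch. It is not a description of how SAS is implemented; it is a hypothetical model in which a table keyed by the ENCODING value gives the code point of $ (the two values come from the paragraph above), and a code point is treated as $ when it matches the entry for the session encoding. The names DOLLAR_CODE_POINT and is_dollar_sign are assumptions made for illustration.

```python
# Code point of the $ syntax character per session encoding. The values are
# taken from the text above; the table itself is a hypothetical model.
DOLLAR_CODE_POINT = {
    "OPEN_ED-1047": 0x5B,
    "OPEN_ED-1143": 0x67,
}


def is_dollar_sign(code_point: int, session_encoding: str) -> bool:
    """Return True when code_point means $ in the given session encoding."""
    return DOLLAR_CODE_POINT[session_encoding] == code_point


# Each session recognizes its own code point as the same character, $.
print(is_dollar_sign(0x5B, "OPEN_ED-1047"))  # True
print(is_dollar_sign(0x67, "OPEN_ED-1143"))  # True
# The 1047 code point for $ is not $ when the session encoding is 1143.
print(is_dollar_sign(0x5B, "OPEN_ED-1143"))  # False
```

In this model the lookup is driven by the session encoding, which mirrors the statement that syntactical meaning belongs to the character and not to a fixed code point.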
In contrast, because z/OS resource names have no inherent encoding, it is
the string of code points that identifies the resource to the system, not
the associated characters. The associated characters might vary depending on
the encoding in which the SAS program is prepared.
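As an illustration, the following Python sketch encodes the two spellings of the data set name, each in the encoding in which it would be typed, and compares the resulting code points. As an assumption, cp037 stands in for the characters that the example name shares across code pages, and the per-encoding tables contain only the X'7B' mappings described earlier in this section. The helper encode_name is hypothetical.

```python
# Assumption: characters without an entry here are encoded with cp037, which
# is used only as a stand-in for the letters, digits, and period in the name.
TO_X7B_IN_1047 = {"#": 0x7B}
TO_X7B_IN_1143 = {"Ä": 0x7B}


def encode_name(name, overrides):
    """Encode a data set name to EBCDIC code points, honoring overrides."""
    return bytes(
        overrides[ch] if ch in overrides else ch.encode("cp037")[0]
        for ch in name
    )


typed_in_1047 = encode_name("PROD.ACCT#104.RAWDATA", TO_X7B_IN_1047)
typed_in_1143 = encode_name("PROD.ACCTÄ104.RAWDATA", TO_X7B_IN_1143)

# Different characters, identical code points: both identify one resource.
print(typed_in_1047 == typed_in_1143)  # True
print(typed_in_1047.hex().upper())
```

Comparing the byte strings, not the character strings, is the design point: the characters differ between the two programs, while the code points that reach the system are the same.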