I/O Functions

Introduction

The SAS/C library provides a large set of input/output functions, which are divided into two groups: standard-style I/O functions and UNIX-style I/O functions. This chapter describes these functions and how they are used.

The following section describes how to perform input and output using the functions provided in the SAS/C library. This section is important if you use SAS/C I/O facilities, whether you are developing new programs or porting existing programs from other environments.

In addition to the traditional C I/O facilities described in this section, the library offers, for both CMS and MVS, a set of functions to perform low-level I/O, making direct use of native I/O facilities. These facilities are described in Chapter 2, "CMS Low-Level I/O Functions," and Chapter 3, "MVS Low-Level I/O Functions," in SAS/C Library Reference, Volume 2.

The library's I/O implementation is designed to

support the ISO/ANSI C standard
support the execution of existing programs developed with other C implementations
support the development of new portable programs
support the effective use of native MVS and CMS I/O facilities and file types.

As described later in this chapter, the library provides several I/O techniques to meet the needs of different applications. To achieve the best results, you must make an informed choice about the techniques to use. Criteria that should influence this choice are

the need for portability (For instance, will the program execute on several different systems?)
the required capabilities (For instance, will the program need to alter the file position randomly during processing?)
the need for efficiency (For instance, can the program accept some restrictions on file format to achieve good performance?)
the intended use of the files (For instance, will files produced by the program later be processed by an editor or by a program written in another language?).

To make these choices, you need to understand general C I/O concepts as well as the native I/O types and file structures supported by the 370 operating systems, MVS and CMS. These topics are addressed in this chapter. The description is aimed primarily at the knowledgeable C programmer who should be familiar with 370 I/O concepts. In many cases, understanding the 370 I/O concepts is necessary to control and anticipate program behavior. Where possible, this chapter addresses these issues, but familiarizing yourself with 370 I/O concepts using other sources is highly recommended. Chapter 1, "Introduction," of the SAS/C Compiler and Library User's Guide, Fourth Edition lists the documents from International Business Machines Corporation that may be of particular value.

Some parts of this chapter are intended for knowledgeable 370 programmers who may be interested in the relationship between SAS/C I/O and traditional 370 I/O techniques. These portions are identified as such, and you can skip them if you do not have the necessary background in 370 I/O concepts.

This chapter is divided into two sections: technical background and technical summaries. For the most effective use of SAS/C I/O techniques, you should become familiar with the concepts presented in the next section, "Technical Background." Skim "Technical Summaries" for information relevant to your application, and consult specific I/O function descriptions for details on the functions. Much of the material in the technical summaries and the function descriptions is reference information of limited applicability, but understanding the technical background section is essential for effective use of the library I/O functions.

Technical Background

This section provides a fairly in-depth summary of the fundamentals of C I/O. It begins with a discussion of traditional C I/O concepts, then discusses UNIX low-level, ISO/ANSI, and IBM 370 I/O concepts. These concepts are combined in "SAS/C I/O Concepts" and "370 Perspectives on SAS/C Library I/O." The final section provides guidelines for choosing an I/O method, based on the needs of your application.

Traditional C (UNIX) I/O Concepts

When C was initially designed, no library, and therefore no I/O, was included. It was assumed that libraries suitable for use with particular systems would be developed. Because most early use of the C language was associated with UNIX operating systems, the UNIX I/O functions were considered the standard I/O method for C. As the C language has evolved, the I/O definition has changed to some extent, but understanding the underlying UNIX concepts is still important.

In addition, many useful C programs were first developed under UNIX operating systems, and such programs frequently are written with no awareness of other systems or I/O techniques. Such programs cannot be run on systems as different from UNIX as CMS or MVS unless their original environment is carefully considered.

The UNIX I/O model

The main features of the UNIX I/O model are as follows:

A file is a sequence of characters. A file contains no information other than these characters. It is possible to create a file containing no characters.
A file is divided into lines by the new-line character ('\n'). New-line characters have no other special properties. A file may contain lines of any length, including 0.
The characters in a file are numbered sequentially, starting at 0. It is possible to position a file efficiently at any particular character.
No arbitrary restrictions are imposed on the lengths of lines in a file or on the size of a file. Padding characters are never written to fill out a file or a line to a particular length or boundary.
Files can be opened for reading, writing, or both. When a file is opened for writing, the previous contents can optionally be erased. After the file is opened, characters can be replaced but not removed. That is, the end-of-file position can be advanced but not moved backwards.

UNIX Low-Level I/O

One complication in programs developed under UNIX operating systems is that UNIX defines two different I/O interfaces: standard I/O and low-level I/O (sometimes called unbuffered I/O). Standard I/O is a more portable form of I/O than low-level I/O, and UNIX documentation recommends that portable programs be written using this form. However, UNIX low-level I/O is widely recognized as more efficient than standard I/O, and it provides some additional capabilities, such as the ability to test whether a file exists before it is opened. For these and other reasons, many programs use low-level I/O, despite its documented lack of portability.

UNIX operating systems also support a mixed-level form of I/O, wherein a file is accessed simultaneously with standard I/O and low-level I/O. C implementations that support the UNIX low-level functions may be unable to support mixed-level I/O, if the two forms of I/O are not closely related in the UNIX manner.

UNIX low-level I/O is not included in the ISO/ANSI C standard, so it may be unavailable with recently developed C compilers. Also, do not assume that this form of I/O is truly low-level on any system other than UNIX.
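As an illustration of the low-level interface discussed above, the following minimal sketch counts the bytes in a file using the POSIX calls access, open, read, and close. It assumes a POSIX environment that provides <fcntl.h> and <unistd.h>; the function name count_bytes_low_level is hypothetical and is not part of any library described in this chapter.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Count the bytes in a file with POSIX low-level I/O.
   Returns the byte count, or -1 if the file is missing or unreadable.
   Assumes a POSIX system; none of these calls is part of ISO/ANSI C. */
long count_bytes_low_level(const char *path)
{
    char buf[512];
    long total = 0;
    ssize_t n;
    int fd;

    assert(path != NULL);

    /* Low-level I/O allows an existence test before the file is opened. */
    if (access(path, F_OK) != 0)
        return -1;

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* read returns the number of bytes transferred, 0 at end of file. */
    while ((n = read(fd, buf, sizeof buf)) > 0)
        total += (long)n;

    close(fd);
    return n < 0 ? -1 : total;
}
```

A portable program would normally obtain the same result with fopen and getc, at the cost of the existence test.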

ISO/ANSI C I/O Concepts

The definition of the C I/O library contained in the ISO/ANSI C standard is based on the traditional UNIX standard I/O definition, but differs from it in many ways. These differences exist to support efficient I/O implementations on systems other than UNIX, and to provide some functionality not offered by UNIX. In general, where definitions of I/O routines differ between ISO/ANSI C and UNIX C, programs should assume the ISO/ANSI definitions for maximum portability. The ISO/ANSI definitions are designed for use on many systems including UNIX, while the applicability of the UNIX definitions is more limited.

Text access and binary access

In the UNIX I/O model, files are divided into lines by the new-line character ('\n'). For this reason, C programs that process input files one line at a time traditionally read characters until a new-line character is encountered. Similarly, programs that write output one line at a time write a new-line character after each line of data.

Many systems other than UNIX use other conventions for separating lines of text. For instance, the IBM PC operating system, PC DOS, separates lines of text with two characters, a carriage return followed by a line feed. The IBM 370 uses yet another method. To enable a line-oriented C program written for UNIX to execute under PC DOS, a C implementation must translate a carriage return and line feed to a new-line character on input, and must translate a new-line character to a carriage return and line feed on output. Although this translation is appropriate for a line-oriented program, it is not appropriate for other programs. For instance, a program that writes object code to a file cannot tolerate replacement of a new-line character in its output by a carriage return and a line feed. For this reason, most systems other than UNIX require two distinct forms of file access: text access and binary access.

The ISO/ANSI I/O definition requires that when a program opens a file, it must specify whether the file is to be accessed as a text stream or a binary stream. When a file is accessed as a binary stream, the implementation must read or write the characters without modification. When a file is accessed as a text stream, the implementation must present the file to the program as a series of lines separated by new-line characters, even if a new-line character is not used by the system as a physical line separator. Thus, under PC DOS, when a program writes a file using a binary stream, any new-line characters in the output data are written to the output file without modification. But when a program writes a file using a text stream, a new-line character in the output data is replaced by a carriage return and a line feed to serve as a standard PC DOS line separator.
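The output translation described above for PC DOS can be sketched as a small function. This is only an illustration of the idea, not the code of any actual library; the name text_out_crlf and its buffer interface are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Expand each new-line character in src to a carriage return and line
   feed, as a text stream on PC DOS would do on output.
   Returns the number of characters stored in dst (not counting the
   terminating null), or (size_t)-1 if dst is too small. */
size_t text_out_crlf(const char *src, char *dst, size_t dstsize)
{
    size_t j = 0;

    assert(src != NULL && dst != NULL);

    for (; *src != '\0'; src++) {
        size_t need = (*src == '\n') ? 2 : 1;

        /* Keep one position free for the terminating null. */
        if (j + need >= dstsize)
            return (size_t)-1;
        if (*src == '\n')
            dst[j++] = '\r';
        dst[j++] = *src;
    }
    if (dstsize == 0)
        return (size_t)-1;
    dst[j] = '\0';
    return j;
}
```

With this sketch, the four characters a\nb\n become six characters on output, which is the difference between what the program writes and what is physically stored.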

If a file contains a real new-line character (one that is not a line separator) and the file is read as a text stream, the program will probably misinterpret the new-line character as a line separator. Similarly, a program that writes a carriage return to a text stream may generate a line separator unintentionally. For this reason, the ISO/ANSI library definition leaves the results undefined when any nonprintable characters (other than horizontal tab, vertical tab, form feed, and the new-line character) are read from or written to a text stream. Therefore, text access should be used only for files that truly contain text, that is, lines of printable data.

Programs that open a file without explicitly specifying binary access are assumed to require text access, because the formats of binary data, such as object code, vary widely from system to system. Thus, portable programs are more likely to require text access than binary access.
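In standard C, the choice between text and binary access is made with the mode string passed to fopen: a mode without "b" requests text access, and a mode containing "b" requests binary access. The following sketch, with the hypothetical name physical_length, writes data through a text stream and then counts what is read back through a binary stream. The result depends on the system: it equals the number of characters written on UNIX, and may be larger where line separators are translated or files are padded.

```c
#include <assert.h>
#include <stdio.h>

/* Write text through a text stream, then count the characters seen
   when the same file is read as a binary stream.
   Returns the count, or -1 on any I/O failure. */
long physical_length(const char *path, const char *text)
{
    FILE *fp;
    long count = 0;

    assert(path != NULL && text != NULL);

    fp = fopen(path, "w");            /* no "b": text access */
    if (fp == NULL)
        return -1;
    if (fputs(text, fp) == EOF) {
        fclose(fp);
        return -1;
    }
    if (fclose(fp) != 0)
        return -1;

    fp = fopen(path, "rb");           /* "b": binary access */
    if (fp == NULL)
        return -1;
    while (getc(fp) != EOF)
        count++;
    fclose(fp);
    return count;
}
```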

Padding

Many non-UNIX file systems require files to consist of one or more data blocks of a fixed size. In these systems, the number of characters stored in a file must be a multiple of this block size. This requirement can present problems for programs that need to read or write arbitrary amounts of data unrelated to the block size; however, it is not a problem for text streams. When a text stream is used, the implementation can use a control character to indicate the logical end of file. This approach cannot be used with a binary stream, because the implementation must pass all data in the file to the program, whether it has control characters or not.

The ISO/ANSI C library definition deals with fixed data blocks by permitting output files accessed as binary streams to be padded with null ('\0') characters. This padding permits systems that use fixed-size data blocks to always write blocks of the correct size. Because of the possibility of padding, files created with binary streams on such systems may contain one or more null characters after the last character written by the program. Programs that use binary streams and require an exact end-of-file indication must write their own end-of-file marker (which may be a control character or sequence of control characters) to be portable.
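One way to write an explicit end-of-file marker is sketched below. The marker value 0x1A and the function names are hypothetical choices for this example, and the sketch assumes that the payload never contains the marker character; a real file format would need a marker, or a length field, suited to its data.

```c
#include <assert.h>
#include <stdio.h>

#define END_MARKER 0x1A   /* hypothetical marker; must not occur in the data */

/* Write len bytes followed by the end marker. Returns 0 or -1. */
int write_with_marker(FILE *fp, const unsigned char *data, size_t len)
{
    assert(fp != NULL && data != NULL);

    if (fwrite(data, 1, len, fp) != len)
        return -1;
    return putc(END_MARKER, fp) == EOF ? -1 : 0;
}

/* Read bytes up to the end marker, ignoring any padding after it.
   Returns the number of bytes stored in buf, or -1 if the marker is
   missing or buf (of capacity cap) is too small. */
long read_until_marker(FILE *fp, unsigned char *buf, size_t cap)
{
    size_t n = 0;
    int c;

    assert(fp != NULL && buf != NULL);

    while ((c = getc(fp)) != EOF) {
        if (c == END_MARKER)
            return (long)n;
        if (n == cap)
            return -1;
        buf[n++] = (unsigned char)c;
    }
    return -1;   /* physical end of file reached without a marker */
}
```

Because the reader stops at the marker, null characters added by the implementation after the last byte written do not affect the result.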

A similar padding concern can occur with text access. Some systems support files where all lines must be the same length. (Files defined under MVS or CMS with record format F are of this sort.) ISO/ANSI permits the implementation to pad output lines with blanks when these files are written and to remove the blanks at the end of lines when the files are read. (A blank is used in place of a null character, because text access requires a printable padding character.) Therefore, portable programs should not write lines containing trailing blanks and expect to read the blanks back if the file will be processed later as input.

Similarly, some systems (such as CMS) support only nonempty lines. Again, ISO/ANSI permits padding to circumvent such system limitations. When a text stream is written, the Standard permits the implementation to write a line containing a single blank, rather than one containing no characters, provided that this line is always read back as one containing no characters. Therefore, portable programs should not depend on a distinction between empty lines and lines that contain a single blank.

Finally, some systems (such as CMS) do not permit files containing no characters. A program is nonportable if it assumes a file can be created merely by opening it and closing it, without writing any characters.
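A program can make itself insensitive to blank padding by removing trailing blanks from every line it reads, so that its behavior is the same whether or not the implementation pads. The following is a minimal sketch; the function name strip_trailing_blanks is hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Remove trailing blanks from a line, keeping a final new-line
   character if one is present. Returns the new length of the line. */
size_t strip_trailing_blanks(char *line)
{
    size_t len;
    int had_newline = 0;

    assert(line != NULL);

    len = strlen(line);
    if (len > 0 && line[len - 1] == '\n') {
        had_newline = 1;
        len--;
    }
    while (len > 0 && line[len - 1] == ' ')
        len--;
    if (had_newline)
        line[len++] = '\n';
    line[len] = '\0';
    return len;
}
```

After this normalization, a line containing a single blank and an empty line are indistinguishable, which matches what the Standard allows an implementation to do.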

File positioning with fseek and ftell

As stated earlier, the UNIX I/O definition features seeking by character number. For instance, it is possible to position directly to the 10,000th character of a file. On a system where text access and binary access are different, the meaning of a request to seek to the 10,000th character of a text stream is not well defined. The ftell and fseek functions enable you to obtain the current file position and return to that position, no matter how the system implements text and binary access.

Consider a system such as PC DOS, where the combination of carriage return and line feed is used as a line separator. Because of the translation, a program that counts the characters it reads is likely to determine a different character position from the position maintained by the operating system. (A line that the program interprets as n characters, including a final new-line character, is known by the operating system to contain n+1 characters.)

Some systems, such as the 370 operating systems, do not record physical characters to indicate line breaks. Consider a file on such a system composed of two lines of data, the first containing the single character 1 and the second containing the single character 2. A program accessing this file as a text stream receives the characters 1\n2\n. The program must process four characters, although only two are physically present in the file. A request to position to the second character is ambiguous. The library cannot determine whether the next character read should be \n or 2.
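The two-record example can be modeled in portable C. The sketch below is a simulation only: it represents records as C strings and shows how a library might present them to a program as a text stream. The name records_as_text is hypothetical, and no actual library interface is implied.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Present a sequence of records as the text stream a program would
   see: each record followed by a new-line character that is not
   physically stored in the file.
   Returns the number of characters produced, or (size_t)-1 if out
   (of size outsize) is too small. */
size_t records_as_text(const char *const records[], size_t nrec,
                       char *out, size_t outsize)
{
    size_t used = 0;
    size_t i;

    assert(records != NULL && out != NULL && outsize > 0);

    for (i = 0; i < nrec; i++) {
        size_t len = strlen(records[i]);

        /* Room is needed for the record, its new-line, and a final null. */
        if (used + len + 1 >= outsize)
            return (size_t)-1;
        memcpy(out + used, records[i], len);
        used += len;
        out[used++] = '\n';
    }
    out[used] = '\0';
    return used;
}
```

For the records "1" and "2", the function produces four characters from two physical characters, which is why a character number alone does not identify a position in the file.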

Even if you resolve the ambiguity of file positioning in favor of portability (by counting the characters seen by the program rather than physical characters), implementation difficulties may preclude seeking to characters by number using a text stream. Under PC DOS, the only way to seek accurately to the 10,000th character of a file is to read 10,000 characters because the number of carriage return and line feed pairs in the file is not known in advance. If the file is opened for both reading and writing, replacing a printable character with a new-line character requires replacing one physical character with two. This replacement requires rewriting the entire file after the point of change. Such difficulties make it impractical on many systems to seek for text streams based on a character number.

Situations such as those discussed in this section show that on most systems where text and binary access are not identical, positioning in a text stream by character number cannot be implemented easily. Therefore, the ISO/ANSI standard permits a library to implement random access to a text stream using some indicator of file position other than character number. For instance, a file position may be defined as a value derived from the line number and the offset of the character in the line.

File positions in text streams cannot be used arithmetically. For instance, you cannot assume that adding 1 to the position of a particular character results in the position of the next character. Such file positions can be used only as tokens. This means that you can obtain the current file position (using the ftell function) and later return to that position (using the fseek function), but no other portable use of the file position is possible.
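The token style of positioning can be sketched as follows. The function name read_line_twice is hypothetical; the only operations applied to the value returned by ftell are storing it and passing it back to fseek with SEEK_SET.

```c
#include <assert.h>
#include <stdio.h>

/* Read the next line, return to where it began, and read it again.
   The value from ftell is treated as an opaque token: it is never
   incremented, compared, or otherwise used arithmetically.
   Returns 0 on success, -1 on failure. */
int read_line_twice(FILE *fp, char *first, char *second, int size)
{
    long token;

    assert(fp != NULL && first != NULL && second != NULL && size > 0);

    token = ftell(fp);                 /* remember the current position */
    if (token == -1L)
        return -1;
    if (fgets(first, size, fp) == NULL)
        return -1;
    if (fseek(fp, token, SEEK_SET) != 0)   /* return to that position */
        return -1;
    if (fgets(second, size, fp) == NULL)
        return -1;
    return 0;
}
```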

This change from UNIX behavior applies only to text streams. When you use fseek and ftell with a binary stream, the ISO/ANSI standard still requires that the file position be the physical character number.

File positioning with fgetpos and fsetpos

Even with the liberal definition of random access to a text stream given in the previous section, implementation of random access can present major problems for a file system that is very different from a traditional UNIX file system. The traditional MVS file system is an example of such a system. To assist users of these file systems, the Standard includes two non-UNIX functions, fsetpos and fgetpos.

File systems like the MVS file system have two difficulties implementing random access in the UNIX (ISO/ANSI binary) fashion:

They do not record character-oriented position information. For many MVS files, such as those with record format VB, a request to position to the 10,000th character can be satisfied only by positioning to the first character and then reading until 10,000 characters have been read. (To determine the number of characters in the file, it is necessary to read the entire file.)
Some files may contain more characters than the largest possible long int value. Because UNIX operating systems and the Standard define the file position to have type long int, random access to all such enormous files cannot be supported.

The functions fgetpos and fsetpos are defined by the Standard to perform operations similar to those of fseek and ftell, except that the representation of a file position is completely implementation-defined. This allows an implementation to choose a representation for the file position that is large enough to address all characters of the largest possible file and that can take into account all the idiosyncrasies of the host operating system. (For example, the file position may reference a disk track number rather than a record number or byte number.) Thus, using fgetpos and fsetpos for random access produces the greatest likelihood that a program will run on a system dissimilar to UNIX.
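A typical use of fgetpos and fsetpos is sketched below: the program looks at the next line and then restores the stream position, without making any assumption about how the position is represented. The function name peek_line is hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Read the next line into buf without consuming it: the position is
   saved in an fpos_t object and restored afterward.
   Returns 0 on success, -1 on failure or end of file. */
int peek_line(FILE *fp, char *buf, int size)
{
    fpos_t pos;
    int got_line;

    assert(fp != NULL && buf != NULL && size > 0);

    if (fgetpos(fp, &pos) != 0)        /* save the position */
        return -1;
    got_line = (fgets(buf, size, fp) != NULL);
    if (fsetpos(fp, &pos) != 0)        /* restore the position */
        return -1;
    return got_line ? 0 : -1;
}
```

The fpos_t object is only stored and passed back; its contents are never examined, so the sketch does not depend on the position being a character number.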

The fsetpos and fgetpos functions did not exist prior to the definition of the ISO/ANSI C standard. Because many C libraries have not yet implemented them, they are at this time less portable than fseek and ftell, which are compatible with UNIX operating systems.

However, it is a relatively straightforward task to implement them as macros that call fseek and ftell in such systems. After these macros have been written, fsetpos and fgetpos are essentially as portable as their UNIX counterparts and will offer substantial additional functionality where provided by the library on systems such as MVS.
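Such macros might look like the following sketch. The names compat_fpos_t, compat_fgetpos, and compat_fsetpos are hypothetical and are chosen so that they do not collide with a library that already provides the real functions; the sketch assumes that a long int file position is adequate on the target system.

```c
#include <assert.h>
#include <stdio.h>

/* Stand-ins for fgetpos and fsetpos on a system whose library offers
   only fseek and ftell. The position is simply the long value that
   ftell returns. */
typedef long compat_fpos_t;

/* Both macros evaluate to 0 on success and -1 on failure. */
#define compat_fgetpos(fp, posp) \
    ((*(posp) = ftell(fp)) == -1L ? -1 : 0)
#define compat_fsetpos(fp, posp) \
    (fseek((fp), *(posp), SEEK_SET) != 0 ? -1 : 0)

/* Usage sketch: read one character, then return to where it was read.
   Returns the character, or EOF on failure or end of file. */
int peek_char(FILE *fp)
{
    compat_fpos_t pos;
    int c;

    assert(fp != NULL);

    if (compat_fgetpos(fp, &pos) != 0)
        return EOF;
    c = getc(fp);
    if (compat_fsetpos(fp, &pos) != 0)
        return EOF;
    return c;
}
```

Code written against these names can later be switched to the real fgetpos and fsetpos by redefining the three names, with no change to the calling code.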

The ISO/ANSI I/O model

The following list describes the I/O model for ISO/ANSI C. The points are listed in the same order as the corresponding points for the UNIX I/O model, as presented earlier in "The UNIX I/O model."

A file may be processed in one of two ways: as a text stream, or as a binary stream. When a file is processed as a binary stream, it appears to the program as a sequence of characters. It may not be possible to create a file containing no characters.
A file accessed as a text stream appears to the program as a sequence of lines separated by occurrences of the new-line character ('\n'). The effects of reading or writing control characters using a text stream are not predictable. An implementation is permitted to record line separators using some technique other than physical new-line characters.
When a file is accessed as a binary stream, its characters are numbered sequentially starting at 0. It is possible to position a binary stream to any particular character. When a file is accessed as a text stream, its characters are addressable, but not necessarily by a physical character number. It is possible to position a file accessed as a text stream to any character, provided the address of that character was obtained at the time of some previous access to that character.
An implementation may restrict the size of lines or files. The implementation may pad files accessed as a binary stream with null characters at the end of the file, and it may pad files accessed as a text stream with blanks at the end of each line.
Files can be opened for reading or writing, or both. When a file is opened for writing, the previous contents can optionally be erased. It is undefined whether writing a character before the end of file shortens the length of the file or leaves it unchanged.

IBM 370 I/O Concepts

Programmers accustomed to other systems frequently find the unique nature of 370 I/O confusing. This section organizes the most significant information about 370 I/O for SAS/C users. Note that this description is general rather than specific. Details and complex special cases are generally omitted to avoid obscuring the basic principles. See the introduction to the SAS/C Compiler and Library User's Guide, Fourth Edition for a small bibliography of relevant IBM publications that should be consulted for additional information.

Fundamental principles

There are two 370 operating systems of interest, MVS and CMS. They implement different file systems. (CMS also implements OS simulation, which emulates MVS I/O under CMS. The emulation is not perfect and is actually a third I/O implementation.)
Many file systems feature exactly one kind of file. For instance, in UNIX all files are simply sequences of characters. The 370 operating systems, especially MVS, go to the opposite extreme and handle many different types of files, each with its own peculiarities and uses. In general, the programmer must decide during program design which sorts of files a program will use.
370 I/O is record oriented. That is, files are treated as sequences of records, not sequences of characters. The idea that a physical character or character sequence may be used as a record or line separator is completely alien to the 370 systems. (An analogy that may be helpful is that UNIX operating systems and PC DOS treat line-oriented files as virtual terminals, with lines separated by layout characters such as the new-line and form feed characters. The 370 systems handle files as if they were virtual card decks consisting of physical records separated by gaps.)
Most file systems allow the same program to replace old data in a file and to add new data at the end. In general, 370 I/O does not permit you to mix these two kinds of updates within the same program. When a file is opened using a technique that permits the addition of new data, the replacement of old data generally causes any following data to be discarded.
370 I/O is hardware oriented. It uses physical disk addresses to encode file positions. Under MVS, you cannot address records efficiently, even with a record number. For common file types, you must use an actual disk address to position to a record without reading from the start of a file.
Another aspect of the hardware orientation of 370 I/O is the large number of file attributes that must be assigned, either by the program or by the user. Many of these attributes have no effect other than to alter the physical layout of the data. Such attributes are defined for the sole purpose of enabling the programmer to trade off various aspects of program performance. For example, you can permit a program to execute faster by using more memory for buffer space. In some cases, the ability to tailor these attributes is vital, but frequently the programmer is forced to make such choices when performance is not an important consideration.
The 370 file systems are lacking in disk space management. This means that programs must deal with the inability to enlarge files. It also means that users must provide size estimates to the system when files are created. It is necessary with some commonly used file types to run utilities to reclaim wasted file space. These problems are most notable under MVS, but they can also be a factor under CMS.
For programmers accustomed to the UNIX file system, the conventions for 370 file naming may seem strange. Under MVS, filenames are often given only as indirect names (DDnames in MVS jargon) that can be connected to actual filenames only by the use of a control language. (It is possible to refer to a file by its actual name rather than a DDname, but the absence of directories and reliable user identification under MVS make this an inconvenient and often difficult technique.) Under CMS, either DDnames or more natural filenames can be used, but some programs choose to use DDnames to achieve closer compatibility with MVS.

File organizations under MVS

Under MVS, files are classified first by file organization. A number of different organizations are defined, each tailored for an expected type of usage. For instance, files with sequential organization are oriented towards processing records in sequential order, while most files with VSAM (Virtual Storage Access Method) organization are oriented toward processing based on key fields in the data.

For each file organization, there is a corresponding MVS access method for processing such files. (An MVS access method is a collection of routines that can be called by a program to perform I/O.) For instance, files with sequential organization are normally processed with the Basic Sequential Access Method (BSAM). Sometimes, a file can be processed in more than one way. For example, files with direct organization can be processed either with BSAM or with the Basic Direct Access Method (BDAM).

The file organizations of most interest to C programmers are sequential and partitioned. The remainder of this section relates primarily to these file organizations, but many of the considerations apply equally to the others. A number of additional considerations apply specifically to files with partitioned organization. These considerations are summarized in "MVS partitioned data sets."

Note: An important type of MVS file, the Virtual Storage Access Method (VSAM) file, was omitted from the previous list. VSAM files are organized as records identified by a character string or a binary key. Because these files differ so greatly from the expected C file organization, they are difficult to access using standard C functions. Because of the importance of VSAM files in the MVS environment, full access to them is provided by nonportable extensions to the standard C library.

Note: Also, if your system supports OpenEdition MVS, it provides a hierarchical file system similar to the system offered on UNIX. The behavior of files in the hierarchical file system is described in "UNIX Low-Level I/O." Only traditional MVS file behavior is described here.

The characteristics of a sequential or partitioned file are defined by a set of attributes called data control block (DCB) parameters. The three DCB parameters of most interest are record format (RECFM), logical record length (LRECL), and block size (BLKSIZE).

As stated earlier, MVS files are stored as a sequence of records. To improve I/O performance, records are usually combined into blocks before they are written to a device. The record format of a file describes how record lengths are allowed to vary and how records are combined into blocks. The logical record length of a file is the maximum length of any record in a file, possibly including control information. The block size of a file is the maximum size of a block of data.

The three primary record formats for files are F (fixed), V (variable), and U (undefined). Files with record format F contain records that are all of equal length. Files with format V or U may contain records of different lengths. (The differences between V and U are mostly technical.) Files of both F and V format are frequently used; the preferred format for specific kinds of data (for instance, program source) varies from site to site.

Ideally, the DCB parameters for a file are not relevant to the C program that processes it, but sometimes a C program has to vary its processing based on the format of a file, or to require a file to have a particular format. Some of the reasons for this are as follows:

Because most C programs do not write lines of equal length, a C library implementation must add trailing blanks to the end of output lines in a record format F file and remove them on input. If this is inappropriate for an application, you may need to require the use of a record format V or U file, or to use a nonportable function to inhibit library padding.
When writing to a file with a small logical record length as a text stream, the library may be forced to divide a long line into several records. In this case, when the file is read, the data are not identical to what was written.
Some programs and system utilities require specific DCB attributes. For instance, the MVS linkage editor cannot handle object files whose block size is greater than 3200 bytes. C programs producing input for such programs must be aware of these requirements.
One of the secondary DCB attributes a file can have is the ANSI control characters (RECFM=A) option, which means that the first character position of each record will be used as a FORTRAN carriage control character. The UNIX convention of using characters such as form feed and carriage return to create page formatting can be used only when the output file is defined to use ANSI control characters. Since some editors do not allow such files to be edited, it is generally not appropriate to assign this attribute to all files.

The standard C language does not provide any way for you to interrogate or define file attributes. In cases in which a program depends on file attribute information, you have two choices. You can use control language when files are created or used to define the file attributes, or you can use nonportable mechanisms to access or specify this information during execution.

File organizations under CMS

Like most operating systems, CMS has its own native file system. (In fact, it has two: the traditional minidisk file system and the more hierarchical shared file system.) Unlike most operating systems, CMS has the ability to simulate the file systems of other IBM operating systems, notably OS and VSE. Also, CMS can transfer data between users in spool files with the VM control program (CP).

Therefore, CMS files are classified first by the type of I/O simulation (or lack thereof) used to read or write to them. The three types are

CMS-format files, which are read and written by native CMS I/O support. This category includes spool files (virtual reader and printer files) and CMS disk files, either minidisk-based or in the shared file system.
OS-format files, particularly MACLIBs and TXTLIBs (simulated OS PDSs) and OS files on OS disks. These files are read and written by the CMS simulation of OS BSAM and other OS access methods.
VSE-format files, particularly VSAM files, including VSAM files on OS or VSE disks. These files are read and written by the VSE implementation of VSAM under CMS.

CMS I/O simulation can be used to read files created by OS or VSE, but these operating systems cannot read files created by CMS, even when the files are created using CMS's simulation of their I/O system. In general, CMS adequately simulates OS and VSE file organizations, and the rules that apply in the real operating system also apply under CMS. However, the simulation is not exact. CMS's simulation differs in some details and some facilities are not supported at all.

CMS-format files, particularly disk files, are of most interest to C programmers. CMS disk files have a logical record length (LRECL) and a record format (RECFM). The LRECL is the length of the largest record; it may vary between 1 and 65,535. The RECFM may be F (for files with fixed-length records) or V (for files with variable-length records). Other file attributes are handled transparently under CMS. Files are grouped by minidisk, a logical representation of a physical direct-access device. The attributes of the minidisk, such as writability and block size, apply to the files it contains. Files in the shared file system are organized into directories, conceptually similar to UNIX directories.

All records in a RECFM F file have the same length, which is the file's LRECL. The LRECL is assigned when the file is created and may not be changed. Some CMS commands require that input data be in a RECFM F file. To support RECFM F files, a C implementation must either pad or split lines of output data to conform to the LRECL, and remove the padding from input records.
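
The padding and splitting described above can be sketched in portable C. This is only an illustration, not the library's implementation: the function names are hypothetical, and the use of a blank as the pad character is an assumption.

```c
#include <stddef.h>
#include <string.h>

/* Copy one text line into a fixed-length record of lrecl bytes,
   padding on the right with blanks (assumed pad character).
   Returns the number of characters of the line that were used, so
   the caller can continue with the remainder when the line is
   longer than lrecl and must be split. */
size_t line_to_fixed_record(const char *line, size_t len,
                            char *record, size_t lrecl)
{
    size_t used = len < lrecl ? len : lrecl;

    memcpy(record, line, used);
    memset(record + used, ' ', lrecl - used);
    return used;
}

/* Return the length of the data in a record once trailing blank
   padding has been removed. */
size_t fixed_record_data_length(const char *record, size_t lrecl)
{
    size_t n = lrecl;

    while (n > 0 && record[n - 1] == ' ')
        n--;
    return n;
}
```

Note that removing padding on input cannot distinguish pad characters from trailing blanks that were part of the original line, which is one reason data read back from such a file may differ from what was written.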

RECFM V files have records of varying length. The LRECL is the length of the longest record in the file, so it may be changed at any time by appending a new record that is longer than any other record. However, changing the record length of RECFM V files causes any following records to be erased. The length of any given record can be determined only by reading the record. (Note that the CMS LRECL concept is different from the MVS concept for V format files, as the LRECL under MVS includes extra bytes used for control information.)

Some rules apply for both RECFM F and RECFM V files. Records in CMS files contain only data. No control information is embedded in the records. Records may be updated without causing loss of data. Files may be read sequentially or accessed randomly by record number.

As under MVS, files that are intended to be printed reserve the first character of each record for an ANSI carriage control character. Under CMS, these files can be given a filetype of LISTING, which is recognized and treated specially by commands such as PRINT. If a C program writes layout characters, such as form feeds or carriage returns, to a file to effect page formatting, the file should have the filetype LISTING to ensure proper interpretation by CMS.
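
As an illustration of the relationship between C layout characters and ANSI carriage control, the following sketch builds a print record from a text line. The function name and the particular mapping are assumptions made for the example; this section does not specify how the library itself performs the translation.

```c
#include <stddef.h>
#include <string.h>

/* Build a print record whose first character is an ANSI carriage
   control character, from a text line that may begin with a
   layout character:
     '\f' (form feed)       -> '1' : skip to a new page
     '\r' (carriage return) -> '+' : overprint the previous line
     anything else          -> ' ' : advance one line
   Returns the record length, or 0 if the record does not fit. */
size_t make_ansi_record(const char *line, char *record, size_t size)
{
    char cc = ' ';
    size_t len;

    if (*line == '\f') {
        cc = '1';
        line++;
    } else if (*line == '\r') {
        cc = '+';
        line++;
    }

    len = strlen(line);
    if (len + 2 > size)          /* control char + data + '\0' */
        return 0;
    record[0] = cc;
    memcpy(record + 1, line, len + 1);
    return len + 1;
}
```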

Be aware that the standard C language does not provide any way for you to interrogate or define file attributes. In cases in which a program depends on file attribute information, you have two choices. You can use the FILEDEF command to define file attributes (if your program uses DDnames), or you can use nonportable mechanisms to access or specify this information during execution.

MVS partitioned data sets

As stated earlier, one of the important MVS file organizations is the partitioned organization. A file with partitioned organization is more commonly called a partitioned data set (PDS) or a library. A PDS is a collection of sequential files, called members, all of which share the same area of disk space. Each member has an eight-character member name. Under MVS, source and object modules are usually stored as PDS members. Also, almost any other sort of data may be stored as a PDS member rather than as an ordinary sequential file.

Partitioned data sets have several properties that make them particularly difficult for programs that were written for other file systems to handle:

It is not possible to add data to the end of a PDS member. Because each member is usually adjacent to the end of the previous member on the disk, adding data to the end of one member would destroy the next one. To change the size of a PDS member, it usually is necessary to copy and rewrite the entire member.
Members are always added to a PDS at the end of the file. For this reason, it is impossible to write to two members of the same PDS at the same time, as this causes the two members to overlap randomly.
When a member is replaced in a PDS, the space used by any previous member with the same name is not reclaimed. This makes PDS's particularly susceptible to running out of space. It is necessary to run a system utility to reclaim unused space in a PDS.
A member does not always occupy the same spot in a PDS. Because PDS file positions are represented relative to the start of the entire PDS, file positions may differ between identical copies of the same data, even if all file attributes are identical.

These limitations may cause ISO/ANSI-conforming programs to fail when they use PDS members as input or output files. For instance, it is reasonable for a program to assume that it can append data to the end of a file. But due to the nature of PDS members, it is not feasible for a C implementation to support this, except by saving a copy of the member and then replacing the member with the copy. Although this technique is viable, it is very inefficient in both time and disk space. (This tradeoff between poor performance and reduced functionality is one that must be faced frequently when using C I/O on the 370. PDS members, which are perhaps the most commonly used kind of MVS file, are the most prominent examples of such a tradeoff.)

Note: Recent versions of MVS support an extended form of PDS, called a PDSE. Some of the previously described restrictions on a PDS do not apply to a PDSE. For example, unused space is reclaimed automatically in a PDSE.

CMS MACLIBs and TXTLIBs

Two important OS-simulated file types on CMS are the files known as MACLIBs and TXTLIBs. Both of these are simulations of OS-partitioned data sets. MACLIBs are typically used to collect textual data or source code; TXTLIBs may contain only object code. Unlike OS PDS's, these files always have fixed-length, 80-character records.

In general, MACLIBs and TXTLIBs may not be written by OS-simulated I/O. Instead, data are added or removed a member at a time by CMS commands. Input from MACLIBs and TXTLIBs can be performed using either OS-simulation or native CMS I/O.

Identifying files

In UNIX operating systems and similar systems, files are identified in programs by name, and program flexibility with files is achieved by organizing files into directories. Files with the same name may appear in several directories, and the use of a command language to establish working directories enables the user of a program to define program input and output flexibly at run time.

In the traditional MVS file system, all files occupy a single name space. (This is an oversimplification, but a necessary one.) Programs that open files by a physical filename are limited to the use of exactly one file at a site. You can use several techniques to increase program flexibility in this area, none of which is completely satisfactory. These techniques include the following:

Specify filenames in TSO format. When the time-sharing option of MVS (TSO) is used, each user's files usually begin with a userid, thereby ensuring that the filenames chosen by different users do not overlap. By convention, a user running under TSO can omit the userid from a filename specification. This helps considerably for those programs that always run interactively and never in batch mode. However, userid is a TSO concept and, unless a site uses optional software (such as an IBM or other vendor security system), programs cannot be associated with a userid when running in batch.
Specify filenames as DDnames. Under MVS-batch, using DDnames to identify files is traditional. A DDname is an indirect name associated with an actual filename or device addressed by a DD statement in batch or an ALLOCATE command under TSO. Programs that use DDnames to identify files are completely flexible. They can produce printed output, terminal output, or disk output, depending only on their control language. Unfortunately, control language must always be used, because there are no default file definitions.
Because most traditional filenames include periods, which are not permitted in DDnames, programs from other environments may need to be modified if they are to use DDnames, and if the logic of the program will withstand such a change.
Determine filenames dynamically at run time rather than putting them in the program. For instance, you may get filenames from the user or from a profile or configuration file. This is the most flexible technique, but it may require extensive program changes.

Under CMS, you can use other techniques to increase program flexibility:

The concept of the CMS minidisk replaces the UNIX directory concept. However, CMS minidisks are not arranged hierarchically, as UNIX directories are arranged. CMS minidisks are not identified by name or device address but by filemode letter, which is assigned by using the CMS ACCESS command and can be changed at any time. (Because the same filename may exist on several minidisks, it may be necessary to include a filemode letter in a filename to make it unambiguous.) In many ways, the minidisk with filemode letter A corresponds to the UNIX working directory, but this analogy is only approximate.
CMS filenames use spaces rather than periods to separate their parts. This is not a problem, because it is natural for a C library to treat the filename xyz.c as XYZ C under CMS.
The CMS shared file system is hierarchically arranged, so there is often a natural correspondence between a UNIX pathname and a shared filename. Unfortunately, the differing character conventions of CMS and UNIX will generally inhibit a UNIX-oriented program from running unchanged with the shared file system. For example, the UNIX pathname /tools/asm/main.c corresponds to the shared filename MAIN C TOOLS.ASM.
CMS supports using DDnames for filenames instead of physical filenames. This feature allows programs to be easily ported between MVS and CMS. The file referred to by a DDname must be defined by using the CMS FILEDEF command before a program that uses the DDname is executed.
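
The correspondence between UNIX names and CMS names used in the examples above (xyz.c and XYZ C, /tools/asm/main.c and MAIN C TOOLS.ASM) can be sketched as a small conversion routine. The function is hypothetical and handles only the character conventions shown in those examples; it makes no attempt to deal with CMS name length limits or with any other components a real shared file system directory name may require.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Convert a UNIX-style pathname such as "/tools/asm/main.c" into
   the form "MAIN C TOOLS.ASM".  A name with no directory part,
   such as "xyz.c", becomes "XYZ C".
   Returns 0 on success, or -1 if the path has no file name or the
   result does not fit in out. */
int unix_to_cms_name(const char *path, char *out, size_t size)
{
    const char *slash = strrchr(path, '/');
    const char *file = slash ? slash + 1 : path;
    const char *dot = strrchr(file, '.');
    const char *p;
    size_t n = 0;

    if (*file == '\0')
        return -1;

    /* Filename and filetype: the last period becomes a blank. */
    for (p = file; *p != '\0'; p++) {
        if (n + 1 >= size)
            return -1;
        out[n++] = (p == dot) ? ' ' : (char)toupper((unsigned char)*p);
    }

    /* Directory part: drop the leading '/', turn '/' into '.'. */
    if (slash != NULL && slash != path) {
        const char *d = (*path == '/') ? path + 1 : path;

        if (n + 1 >= size)
            return -1;
        out[n++] = ' ';
        for (; d < slash; d++) {
            if (n + 1 >= size)
                return -1;
            out[n++] = (*d == '/') ? '.' : (char)toupper((unsigned char)*d);
        }
    }
    out[n] = '\0';
    return 0;
}
```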

File existence

Under MVS, the concept of file existence is not nearly so clear-cut as on other systems, due primarily to the use of DDnames and control language. Since DDnames are indirect filenames, the actual filename must be provided through control language before the start of program execution. If the file does not already exist at the time the DD statement or ALLOCATE command is processed, it is created at that time. Therefore, a file accessed with a DDname must already exist before program execution.

An alternate interpretation of file existence under MVS that avoids this problem is to declare that a file exists after a program has opened it for output. By this interpretation, a file created by control language immediately before execution does not yet exist. Unfortunately, this definition of existence cannot be implemented because of the following technicalities:

MVS does not distinguish in the catalog or Volume Table of Contents (VTOC) between a newly created file that has never been written and one that has been written but is empty (contains no characters).
Attempting to read a file that has never been written produces random results because MVS does not erase disk space when it is allocated or freed. This makes it impossible to distinguish an empty file from a newly created file by trying to read it.

A third interpretation of existence is to say that an MVS file exists if it contains any data (as recorded in the VTOC). This has the disadvantage of making it impossible to read an empty file but the much stronger advantage that a file created by control language immediately before program execution is perceived as not existing. This is the interpretation used in the SAS/C implementation.

This ambiguity about the meaning of existence applies only to files with sequential organization. For files with partitioned organization, only the file as a whole is created by control language; the individual members are created by program action. This means that existence has its natural meaning for PDS members, and that it is possible to create a PDS member containing no characters.

CMS does not allow the existence of files containing no characters, and it is not possible to create such a file.

Miscellaneous differences from UNIX operating systems

The following section lists some additional features of UNIX operating systems and UNIX I/O that some programmers expect to be available on the 370 systems. These features are generally foreign to the 370 environment and impractical to implement. Code that expects the availability of these features is not portable to the 370 no matter how successfully it runs on other architectures.

UNIX operating systems and many other systems support single-character unbuffered terminal I/O in which characters can be read from a terminal one at a time and may not appear on the screen until echoed by the program. This sort of full-duplex protocol is not supported by 370 terminal controllers or operating systems.
Many programs assume that screen formatting is controlled by standard control sequences, such as those used by the DEC VT100 and similar terminals. The common 370 terminal architecture (the 3270 family) bears no similarities whatsoever to that of terminals commonly used with UNIX operating systems. Although MVS and CMS support the use of terminals similar to the VT100, they are not commonly used and are not supported well enough to make running UNIX full-screen applications on them a viable proposition.
The 370 operating systems offer little or no support for the use of files by more than one program simultaneously. Programs that want to do file sharing must issue system calls to synchronize with each other and obey a number of restrictions in the way the shared files are used. Because common system programs such as compilers, linkers, and copy utilities do not attempt to synchronize in this way, attempting to share files with these programs is unsafe.
There is no MVS or CMS concept corresponding to the pipe. Data are usually passed from program to program by means of temporary files.
In general, the size of a file cannot be determined in any way other than by reading the entire file. The MVS and CMS equivalents of directories and inodes record the file size in terms of either the number of records or the hardware address of the end of file.
In UNIX operating systems and many other systems, the time at which a file was last written or accessed can be determined easily. Under MVS, this information is not recorded. For PDS members, popular editors frequently store such information in a control area of the file, but this information is both difficult to access and not reliable, because updates by programs that do not support this feature (such as linkers and copy utilities) do not maintain the data appropriately.

Summary of 370 I/O characteristics

The following list describes the characteristics of 370 files (without any special reference to the C language). The points are listed in the same order as the corresponding points for the UNIX and ISO/ANSI I/O models as presented earlier:

Many different kinds of files are possible. In general, files are not simply sequences of characters, as an additional structure is imposed by grouping the characters of a file into records. Whether a file can contain no characters depends on the file type.
The records of a file are separated by logical or physical gaps. Control characters have no special significance and never serve as record or line separators.
It is not possible to position a file to a particular character. Usually, it is possible to position efficiently to a particular record, but records are frequently identified by hardware-oriented addresses rather than by record numbers.
Most files have restrictions on record length and file size, depending on their attributes. It is frequently necessary to write padding characters to force a file to conform to these attributes.
Files can be opened for reading or writing or both. Usually, it is not possible to open a file so that new characters can be added and old characters replaced. Whether replacing an existing character truncates the file or leaves its length unchanged depends on the file type and on how the file is accessed.

SAS/C I/O Concepts

In an ideal C implementation, C I/O would possess all three of the following properties:

It would be compatible with UNIX operating systems.
It would be efficient.
It would work with all kinds of files.

For the reasons detailed in IBM 370 I/O Concepts, C I/O on the 370 cannot support all three of these properties simultaneously. The library provides several different kinds of I/O to allow the programmer to select the properties that are most important.

The library offers two separate I/O packages:

Standard I/O is defined according to the ISO/ANSI standard. It is efficient and works with all kinds of files, but in many ways it is not compatible with UNIX operating systems. For files with suitable attributes (as described in the next section, "Standard I/O"), standard I/O is efficient and compatible with UNIX operating systems, but many files are not of this sort, especially files under MVS. Besides the ISO/ANSI standard I/O functions, the library provides a number of augmented functions, which provide non-portable access to mainframe-specific functionality.
UNIX style I/O is compatible with UNIX low-level I/O and supports all types of files, but it is generally not efficient.

Details on both of these I/O packages are presented in the following sections.

Two other I/O packages are provided: CMS low-level I/O, defined for low-level access to CMS disk files, and OS low-level I/O, which performs OS-style sequential I/O. These forms of I/O are nonportable and are discussed in Chapter 2, "CMS Low-Level I/O Functions," and Chapter 3, "MVS Low-Level I/O Functions," in SAS/C Library Reference, Volume 2.

Standard I/O

Standard I/O is implemented by the library in accordance with its definition in the C Standard. A file may be accessed as a binary stream, in which case all characters of the file are presented to the program unchanged. When file access is via a binary stream, all information about the record structure of the file is lost. On the other hand, a file may be accessed as a text stream, in which case record breaks are presented to the program as new-line characters ('\n'). When data are written to a text file and then read, the data may not be identical to what was written because of the need to translate control characters and possibly to pad or split text lines to conform to the attributes of the file.

Besides the I/O functions defined by the Standard, several augmented functions are provided to exploit 370-specific features. For instance, the afopen function is provided to allow the program to specify 370-dependent file attributes, and the afread routine is provided to allow the program to process records that may include control characters. Both standard I/O functions and augmented functions may be used with the same file.

Library access methods

The low-level C library routines that interface with the MVS or CMS physical I/O routines are called C library access methods or access methods for short. (The term MVS access method always refers to access methods such as BSAM, BPAM, and VSAM to avoid confusion.) Standard I/O supports five library access methods: "term", "seq", "rel", "kvs", and "fd".

When a file is opened, the library ordinarily selects the access method to be used. However, when you use the afopen function to open a file, you can specify one of these particular access methods.

The library uses the "term" access method to perform terminal I/O; this access method applies only to terminal files. (See Terminal I/O for more information on this access method.)
The "rel" access method is used for nonterminal files whose attributes permit them to support UNIX file behavior when accessed as binary streams.
The "kvs" access method is used for VSAM files when access is via the SAS/C nonstandard keyed I/O functions. (See Using VSAM Files.)
The "fd" access method is used for files in the OpenEdition hierarchical file system.
The "seq" access method is used with all text streams and for binary streams that cannot support the "rel" access method, except when "fd" is used.

The "rel" access method

Under MVS, the "rel" access method can be used for files with sequential organization and RECFM F, FS, or FBS. (The limitation to sequential organization means that the "rel" access method cannot be used to process a PDS member.) Under CMS, the "rel" access method can be used for disk files with RECFM F. The "rel" access method is designed to behave like UNIX disk I/O:

All characters are addressable by their character number. It is possible to position efficiently to any character.
It is possible to replace characters before the end of file and add new data after the end of file without closing and reopening the file. A file never becomes smaller, except when the open call requests that the file's previous contents be discarded.

Because of the nature of the 370 file system, complete UNIX compatibility is not possible. In particular, the following differences still apply:

It is not possible to create a file containing no characters using the "rel" access method.
Padding null characters ('\0') will be added at the end of file, if necessary, to complete a record when the file is closed. If you define a file processed by the "rel" access method to have a record length of 1, you can avoid this padding.
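
The amount of end-of-file padding follows directly from the record length. The helper below is a hypothetical illustration of the arithmetic, not a library function; it assumes a nonzero record length.

```c
/* Number of '\0' padding characters needed to complete the last
   record of a file that holds nbytes of data and has the given
   record length (lrecl must be greater than zero). */
unsigned long rel_padding(unsigned long nbytes, unsigned long lrecl)
{
    unsigned long partial = nbytes % lrecl;

    return partial == 0 ? 0 : lrecl - partial;
}
```

For example, 100 characters written to a file with a record length of 80 require 60 padding characters, while any amount of data written to a file with a record length of 1 requires none.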

The "kvs" access method

The "kvs" access method processes any file opened with the extension open mode "k" (indicating keyed I/O). This access method is discussed in more detail in Using VSAM Files.

The "fd" access method

The "fd" access method processes any file residing in the OpenEdition MVS hierarchical file system. These files are fully compatible with UNIX. In files processed with the "fd" access method, there is no difference between text and binary access.

The "seq" access method

The "seq" access method processes a nonterminal, non-OpenEdition file if any one of the following applies:

the file is to be accessed as text
the file is not suitable for "rel" access
the use of the "seq" access method is specifically requested.

In general, the "seq" access method is implemented to use efficient native interfaces, forsaking compatibility with UNIX operating systems where necessary. Some specific incompatibilities are listed here:

The operating system being used and the file type determine whether an empty file can be created.
File positions are represented in a way natural to the file type and the operating system, not as character numbers. The ISO/ANSI fsetpos and fgetpos functions are fully supported, except for certain files with unusual attributes such as multivolume disk files. (See Tables 3.5 and 3.6 for a complete list of restricted file types.)
The fseek and ftell functions are supported only for text streams. This restriction is necessary because the C Standard requires that the file position be defined as a relative character number for binary streams, which cannot be efficiently determined on 370 systems. If an application requires access to binary data by character number, it should be either restricted to using files that can be processed by the "rel" access method, or it should use the UNIX style I/O package.
Padding of lines for a text stream and padding at end of file for a binary stream frequently occurs. The afopen function gives you some control over the way padding is performed.
For some files, changing data within a file causes the file to be truncated at the point of change; that is, all data following the change is lost. This behavior is system and file type dependent. With afopen, the program can inform the library of any dependence on truncation or lack of truncation. For CMS disk files, truncation is optional and you can use afopen to indicate whether truncation should occur.
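
Because fgetpos and fsetpos treat a file position as an opaque value rather than as a character number, code that only needs to return to a previously visited position can be written without fseek or ftell. The following sketch uses only ISO/ANSI functions; the function name is hypothetical.

```c
#include <stdio.h>

/* Read the next line of f into line without consuming it: the
   position is saved with fgetpos before the read and restored
   with fsetpos afterwards.  Returns 0 on success, -1 on error or
   end of file. */
int peek_line(FILE *f, char *line, int size)
{
    fpos_t mark;

    if (fgetpos(f, &mark) != 0)
        return -1;
    if (fgets(line, size, f) == NULL)
        return -1;
    if (fsetpos(f, &mark) != 0)
        return -1;
    return 0;
}
```

A subsequent fgets call on the same stream returns the same line again, because the saved position was restored.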

UNIX style I/O

The library provides UNIX style I/O to meet two separate needs:

to support the same functions as UNIX low-level I/O: open, read, write, lseek, and close. This allows programs that use these functions to run easily with the SAS/C library.
to support seeking by character number for all files whether or not this is convenient and efficient to implement. This allows programs that require this property to execute successfully, although more slowly, with the SAS/C library.
The library connects the use of the UNIX low-level I/O interface and the ability to do seeking by character number because UNIX documentation has traditionally stressed that seeking by character number is not guaranteed when standard I/O is used. The UNIX Version 7 Programmer's Manual states that the file position used by standard I/O "is measured in bytes only on UNIX; on some other systems it is a magic cookie."
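
A typical use of the five functions named above, including a seek by character number, might look like the following sketch. It is written against the POSIX headers <fcntl.h> and <unistd.h>, which is an assumption; the header names and the permitted form of the filename under the SAS/C library may differ. The function name is hypothetical.

```c
#include <fcntl.h>
#include <unistd.h>

/* Write text to a file, then seek by character number and read
   part of it back using the UNIX low-level interface.
   Returns the number of bytes read, or -1 on error. */
long write_then_read_at(const char *path, const char *text, size_t len,
                        long offset, char *buf, size_t want)
{
    long got = -1;
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);

    if (fd < 0)
        return -1;
    if (write(fd, text, len) == (ssize_t)len &&
        lseek(fd, (off_t)offset, SEEK_SET) == (off_t)offset)
        got = (long)read(fd, buf, want);
    close(fd);
    return got;
}
```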

As a result of the second property, UNIX style I/O is less efficient than standard I/O for the same file, unless the file is suitable for "rel" access, or it is in the OpenEdition hierarchical file system. In these cases, there is little additional overhead.

For files suitable for "rel" access, UNIX style I/O simply translates I/O requests into corresponding calls to standard I/O routines. Thus, for these files there is no decrease in performance.

For files in the OpenEdition hierarchical file system, UNIX style I/O calls the operating system low-level I/O routines directly. For these files, use of standard I/O by UNIX style I/O is completely avoided.

For other files, UNIX style I/O copies the file to a temporary file using the "rel" access method and then performs all requested I/O to this file. When the file is closed, the temporary file is copied back to the user's file, and the temporary file is then removed. This means that UNIX style I/O for files not suitable for "rel" access has the following characteristics:

The necessity of copying the data makes UNIX style I/O somewhat inefficient. However, after the copying is done, file operations are efficient, except for close of an output file, when all the data must be copied back. As an optimization, input data are copied from the user's file only as necessary, rather than copying all the data when the file is opened.
If there is a system failure while a file is being processed with UNIX style I/O, the file is unchanged, because no data are written to an output file until the file is closed.
It is possible for the processing of a file with UNIX style I/O to fail if there is not enough disk space available to make a temporary copy.
Because UNIX style I/O completely rewrites an output file when the file is closed, file truncation does not occur. That is, characters are not dropped as a result of updates before the end of file.
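
The copy-in, update, copy-back strategy described above can be illustrated in portable C. This is a simplified sketch under stated assumptions, not the library's implementation: the function names are hypothetical, the whole file is copied eagerly rather than on demand, and an ordinary tmpfile stands in for a temporary file processed by the "rel" access method.

```c
#include <stdio.h>

/* Copy everything from src (from its beginning) to dst.
   Returns 0 on success, -1 on error. */
static int copy_all(FILE *src, FILE *dst)
{
    char buf[4096];
    size_t n;

    rewind(src);
    while ((n = fread(buf, 1, sizeof buf, src)) > 0)
        if (fwrite(buf, 1, n, dst) != n)
            return -1;
    return ferror(src) ? -1 : 0;
}

/* Replace len bytes at character position pos in the file named
   path by working on a temporary copy.  The user's file is not
   rewritten until the final step.  Returns 0 on success. */
int update_via_temp(const char *path, long pos,
                    const void *data, size_t len)
{
    FILE *user;
    FILE *temp = tmpfile();
    int rc = -1;

    if (temp == NULL)
        return -1;

    /* Step 1: copy the user's file to the temporary file. */
    user = fopen(path, "rb");
    if (user != NULL) {
        rc = copy_all(user, temp);
        fclose(user);
    }

    /* Step 2: perform the requested I/O on the temporary file. */
    if (rc == 0 && (fseek(temp, pos, SEEK_SET) != 0 ||
                    fwrite(data, 1, len, temp) != len))
        rc = -1;

    /* Step 3: at close time, copy the temporary file back. */
    if (rc == 0) {
        rc = -1;
        user = fopen(path, "wb");
        if (user != NULL) {
            rc = copy_all(temp, user);
            if (fclose(user) != 0)
                rc = -1;
        }
    }
    fclose(temp);   /* the temporary file is removed here */
    return rc;
}
```

Because the update is applied to the copy and the entire file is rewritten afterwards, data following the point of change is preserved regardless of whether the underlying file type would truncate on update.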

All of the discussion within this section assumes that the user's file is accessed as a binary file, that is, without reference to any line structure. Occasionally, there are programs that want to use this interface to access a file as a text file. (Most frequently, such programs come from non-UNIX environments like the IBM PC.)

As an extension, the library supports using UNIX style I/O to process a file as text. However, file positioning by character number is not supported in this case, and no copying of data takes place. Instead, UNIX style I/O translates I/O requests to calls equivalent to standard I/O routines.

Note that UNIX style I/O represents open files as small integers called file descriptors. Unlike under UNIX, file descriptors have no inherent significance under MVS and CMS. Some UNIX programs assume that certain file descriptors (0, 1, and 2) are always associated with standard input, output, and error files. This assumption is nonportable, but the library attempts to support it where possible. Programs that use file 0 only for input, and files 1 and 2 only for output, and that do not issue seeks to these files, are likely to execute successfully. Programs that use these file numbers in other ways or that mix UNIX style and standard I/O access to these files are likely to fail.

UNIX operating systems follow specific rules when assigning file descriptors to open files. The library follows these rules for OpenEdition files and for sockets. However, MVS or CMS files accessed using UNIX I/O are assigned file descriptors outside of the normal UNIX range to avoid affecting the number of OpenEdition files or sockets the program can open. UNIX programs that use UNIX style I/O to access MVS or CMS files may therefore need to be changed if they require the UNIX algorithm for allocation of file descriptors.

370 Perspectives on SAS/C Library I/O

This section describes SAS/C I/O from a 370 systems programmer's perspective. In contrast to the other parts of this chapter, this section assumes some knowledge of 370 I/O techniques and terminology.

MVS I/O implementation

Under MVS, the five C library access methods are implemented as follows:

The "term" access method uses TPUT ASIS to write to the terminal and TGET EDIT to read from the terminal. Access to SYSTERM in batch is performed using QSAM.
The "seq" access method uses BSAM and BPAM for both input and output. VSAM is used for access to VSAM ESDS and KSDS data sets.
The "rel" access method uses XDAP and BSAM. XDAP is used for input and to update all blocks of the file except the last block. BSAM is used to update the last block of the file or to add new blocks. VSAM is used to access VSAM relative record data sets, and DIV is used to access VSAM linear data sets.
The "kvs" access method uses VSAM for all operations.
The "fd" access method uses OpenEdition service routines for all operations.

Although BDAM is not used by the "rel" access method, direct organization files that are normally processed by BDAM are supported, provided they have fixed-length records and no physical keys.

CMS I/O implementation

The C library access methods are implemented under CMS as follows:

The "term" access method uses TYPLIN or LINEWRT to write to the terminal and WAITRD or LINERD to read from the terminal.
The "seq" access method uses device-dependent techniques. For CMS disk files, it uses FSCB macros (FSREAD, FSWRITE, and so on). For access to shared files, it uses the CSL DMSOPEN, DMSREAD, and DMSWRITE services. For access to shared file system directories, it uses the DMSOPDIR and DMSGETDI services. For spool files, it uses CMS native macros such as RDCARD. For tape files, filemode 4 disk files, and files on OS disks, it uses simulated MVS BSAM. VSAM KSDS and ESDS data sets are processed using simulated VSE/VSAM.
The "rel" access method uses FSCB macros. Where appropriate, it creates sparse CMS files. VSAM RRDS data sets are processed using simulated VSE VSAM.
The "kvs" access method uses VSE VSAM for all operations.

File attributes for "rel" access under MVS

Under MVS, a file can be processed by the "rel" access method if it is not a PDS or PDS member, and if it has RECFM F, FS, or FBS. These record formats ensure that there are no short blocks or unfilled tracks in the file, except the last, and make it possible to reliably convert a character number into a block address (in CCHHR form) for the use of XDAP. Use of "rel" may also be specified for regular files in the OpenEdition file system (in which case the "fd" access method is used).

If the LRECL of an FBS file is 1, then an accurate end-of-file pointer can be maintained without adding any padding characters. Because of the use of BSAM and XDAP to process the file, use of this tiny record size does not affect program efficiency (data are still transferred a block at a time). However, it may lead to inefficient processing of the file by other programs or languages, notably ones that use QSAM.

File attributes for "rel" access under CMS

Under CMS, a file can be processed by the "rel" access method if it is a CMS disk file (not filemode 4) with RECFM F. Use of RECFM F ensures that a character number can be converted reliably to a record number and an offset within the record.
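
The conversion is simple arithmetic once every record has the same length. In the sketch below, the structure and function names are hypothetical, and the choice of a zero-based character number and a one-based record number is an assumption made for the example.

```c
/* Position of a character within a RECFM F file. */
struct rec_pos {
    unsigned long recno;    /* record number, counting from 1 (assumed) */
    unsigned long offset;   /* offset within the record, from 0 */
};

/* Convert a zero-based character number into a record number and
   an offset within that record (lrecl must be greater than zero). */
struct rec_pos locate_char(unsigned long charno, unsigned long lrecl)
{
    struct rec_pos p;

    p.recno = charno / lrecl + 1;
    p.offset = charno % lrecl;
    return p;
}
```

With an LRECL of 80, character number 200 falls at offset 40 of record 3. No such calculation is possible for a RECFM V file, because the length of each preceding record can be determined only by reading it.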

If the LRECL of a RECFM F file is 1, then an accurate end-of-file pointer can be maintained without ever adding any padding characters. Because the file is processed in large blocks (using the multiple record feature of the FSREAD and FSWRITE macros), use of this tiny record size does not affect program efficiency. Nor does it lead to inefficient use of disk space, because the files are physically blocked according to the minidisk block size. However, it may lead to inefficient processing of the file by other programs or languages that process one record at a time.
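The conversion that fixed-length records make possible is plain integer arithmetic. The following is a minimal sketch, not the library's code; the structure and function names are hypothetical, and the zero-based numbering is an assumption made for the illustration.

```c
/* Position of a character inside a file of fixed-length records. */
struct rec_pos {
    unsigned long record;   /* zero-based record number (assumed numbering) */
    unsigned long offset;   /* zero-based offset within that record */
};

/* Convert a zero-based character number into a record number and an
 * offset, assuming every record is exactly lrecl bytes long (lrecl > 0). */
struct rec_pos char_to_record(unsigned long char_no, unsigned long lrecl)
{
    struct rec_pos p;

    p.record = char_no / lrecl;
    p.offset = char_no % lrecl;
    return p;
}
```

With an LRECL of 1, the record number equals the character number and the offset is always 0, which is why no padding is needed to locate the end of file.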

Temporary files under MVS

Temporary files are created by the library under two circumstances.

They are created if the program calls the tmpfile function.
They are created if the program uses UNIX style I/O and it is necessary to copy a file.

A program can create more than one temporary file during its execution. Each temporary file is assigned a temporary file number, starting sequentially at 1.

One of two methods is used to create the temporary file whose number is nn. First, a check is made for a SYSTMPnn DD statement. If this DDname is allocated and defines a temporary data set, then this data set is associated with the temporary file. If no SYSTMPnn DDname is allocated, the library uses dynamic allocation to create a new temporary file whose data set name depends on the file number. (The system is allowed to select the DDname, so there is no dependency on the SYSTMPnn style of name.) The data set name also depends on information associated with the running C program, so that several C programs can run in the same address space without conflicts occurring between temporary filenames.

If a program is compiled with the posix compiler option, then temporary files are created in the OpenEdition hierarchical file system, rather than as MVS temporary files. The OpenEdition temporary files are created in the /tmp HFS directory.

Temporary files are normally allocated using a unit name of VIO and a space allocation of 50 tracks. The unit name and default space allocation can be changed by a site, as described in the SAS/C installation instructions. If a particular application requires a larger space allocation than the default, use of a SYSTMPnn DD statement specifying the required amount of space is recommended.
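From the program's point of view, the allocation details are hidden behind the standard tmpfile function. The following sketch uses only ISO C calls; the function name tmpfile_round_trip is hypothetical, and the comparison assumes that the data read back are not padded.

```c
#include <stdio.h>
#include <string.h>

/* Write text to a temporary file, rewind, and read it back.
 * Returns 0 if the data read match the data written, -1 otherwise. */
int tmpfile_round_trip(const char *text)
{
    char buf[128];
    size_t len = strlen(text);
    size_t got;
    FILE *tmp;

    if (len >= sizeof buf)
        return -1;              /* keep the sketch simple: small payloads only */

    tmp = tmpfile();            /* temporary file opened for binary update */
    if (tmp == NULL)
        return -1;

    if (fwrite(text, 1, len, tmp) != len) {
        fclose(tmp);
        return -1;
    }

    rewind(tmp);                /* reposition before switching to input */
    got = fread(buf, 1, sizeof buf - 1, tmp);
    fclose(tmp);                /* closing the stream discards the file */

    return (got == len && memcmp(buf, text, len) == 0) ? 0 : -1;
}
```

A call such as tmpfile_round_trip("work data") needs no filename; which temporary data set backs the stream is decided by the library as described above.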

Temporary files under CMS

Temporary files are created by the library under two circumstances.

They are created if the program calls the tmpfile function.
They are created if the program uses UNIX style I/O and it is necessary to copy the file.

A program can create more than one temporary file during its execution. Each temporary file is assigned a temporary file number, starting sequentially at 1.

One of two methods is used to create the temporary file whose number is nn. First, a check is made for a FILEDEF of the DDname SYSTMPnn. If this DDname is defined, then it is associated with the temporary file. If no SYSTMPnn DDname is defined, the library creates a file whose name has the form $$$$$$nn $$$$xxxx, where nn is the temporary file number, and the xxxx part of the filetype is associated with the calling C program. This naming convention allows several C programs to execute simultaneously without conflicts occurring between temporary filenames.

Temporary files are normally created by the library on the write-accessed minidisk with the most available space. Using FILEDEF to define a SYSTMPnn DDname with another filemode allows you to use some other technique if necessary.

Be aware that these temporary files are not known to CMS as temporary files. Therefore, they are not erased if a program terminates abnormally or if the system fails during its execution.

VSAM usage and restrictions

The SAS/C library supports two different kinds of access to VSAM files: standard access and keyed access. Standard access is used when a VSAM file is opened in text or binary mode, and it is limited to standard C functionality. Keyed access is used when a VSAM file is opened in keyed mode. Keyed mode is discussed in detail in Using VSAM Files.

Any kind of VSAM file may be used via standard access, but restrictions apply to particular file types. For example, a KSDS may not be opened for output using standard I/O.

The library supports VSAM ESDS data sets as it supports other sequentially organized file types. A VSAM ESDS can be accessed as a text stream or a binary stream using standard I/O or UNIX style I/O. A VSAM ESDS is not suitable for processing by the "rel" access method because it is not possible, given a character position, to determine the RBA (relative byte address) of the record containing the character.
The library supports VSAM KSDS data sets for input only. Output is not supported for standard access because the C I/O routines are unaware of the existence of keys and cannot guarantee that new records are added in key order. Use keyed access instead. Also, file positioning with fseek or fsetpos is not supported, because records are ordered by key, and it is not possible to transform a key value into the file position formats used for other library file types. When a KSDS is used for input, records are always presented to the program in key order, not in the order of their physical appearance in the data set. Note that KSDS output is available when keyed access is used.
The library supports VSAM RRDS data sets for access via the "rel" access method only. Only RRDS data sets with a fixed record length are supported. As with all other files accessed via "rel", file positioning using fseek and ftell is fully supported.
The library supports VSAM linear data sets, which are also known as Data-in-Virtual (DIV) objects. You must access a DIV object as a binary stream, and you must use the "rel" access method. As with all other files accessed via "rel", file positioning using fseek and ftell is fully supported.

VSAM ESDS, KSDS and RRDS files are processed using a single RPL. Move mode is used to support spanned records. A VSAM file cannot be opened for write only (open mode "w") unless it was defined by Access Method Services to be a reusable file.

Choosing I/O Techniques and File Organization

Because of the wide variety of I/O techniques and 370 file types, it is not always easy to select the right combination for a particular application. Also, the considerations differ for new applications and for existing applications ported from another environment.

New Applications

Recommendations for I/O in new programs depend on whether the program needs to run on other systems. For portable applications, the following guidelines are recommended:

If OpenEdition is available on your system, use OpenEdition files wherever appropriate. Because OpenEdition files implement UNIX semantics, I/O to these files is more portable than I/O to traditional MVS files.
Use standard I/O rather than UNIX style I/O. Because standard I/O is endorsed by the ISO/ANSI standard, it is more portable than UNIX style I/O. It is also more efficient than UNIX style I/O on 370 systems, and it is not appreciably slower on most other systems.
Open a file for text access if the file will be processed as a series of lines. Open it for binary access if it will be processed as a series of characters.
If file positioning is required for text applications, use the fseek and ftell functions, which are more widely available than fsetpos or fgetpos. Note that you cannot use file positions arithmetically with these functions. (You may be forced to use fsetpos and fgetpos rather than fseek and ftell if you need to support very large files.)
If file positioning is required for binary applications, use UNIX style I/O, use fsetpos and fgetpos, or restrict the application to using files suitable for "rel" access. The advantage of UNIX style I/O is that it is applicable to most files and is somewhat portable. The advantage of fsetpos and fgetpos is that they are defined by the C Standard, so they are portable. The advantage of restricting the application to files suitable for "rel" access is that maximum efficiency is achieved with the most portable interface. If the file is only used by C programs (for example, if the file is a work file, or if it is not accessed by the outside world), then requiring suitable file attributes is clearly the best solution.
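As an illustration of the text-positioning guideline, the following sketch saves a position with ftell and returns to it with fseek, treating the value as an opaque token rather than doing arithmetic on it. The function name reread_line is hypothetical; only ISO C calls are used.

```c
#include <stdio.h>
#include <string.h>

/* Read the line at the current position, return to where it started,
 * and read it again. Returns 0 if both reads match, -1 on failure. */
int reread_line(FILE *fp)
{
    char first[64], again[64];
    long mark;

    mark = ftell(fp);                   /* position token; no arithmetic on it */
    if (mark == -1L)
        return -1;
    if (fgets(first, sizeof first, fp) == NULL)
        return -1;
    if (fseek(fp, mark, SEEK_SET) != 0) /* SEEK_SET with a value from ftell */
        return -1;
    if (fgets(again, sizeof again, fp) == NULL)
        return -1;
    return strcmp(first, again) == 0 ? 0 : -1;
}
```

The same pattern can be written with fgetpos and fsetpos by replacing the long with an fpos_t object.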

If your application does not have to be portable, there are several additional possibilities. Note, however, that even if portability is not a requirement, one of the portable techniques described earlier may still be most appropriate.

If your application needs to process data one record at a time, consider using the augmented standard I/O routines afread and afwrite. These routines are especially useful if the records may contain control characters (which makes standard I/O text access inappropriate).
For nonportable applications, use fsetpos and fgetpos rather than fseek and ftell for file positioning. These routines have fewer restrictions and their results are more easily interpreted.
If efficiency is a major consideration, you may want to use low-level I/O, as described in Chapter 2, "CMS Low-Level I/O Functions," and Chapter 3, "MVS Low-Level I/O Functions," in SAS/C Library Reference, Volume 2.
Avoid UNIX style I/O.

Existing Applications

For existing applications, the choices are more difficult. With an existing program, you may be forced to choose between rewriting the program's I/O routines, accepting poor performance, and restricting use of the program to certain types of files. The following is a partial list of the considerations:

If the program uses standard I/O and processes a file as text, changes are required if the file position must be a relative character number. Changes may also be required if the program reads or writes control characters, or is sensitive to the presence or absence of trailing blanks.
If the program uses standard I/O, processes a file as binary, and uses fseek and ftell for file positioning, you must modify the program or restrict it to use only the "rel" access method. (Such an application could be modified to use UNIX style I/O or to use fsetpos and fgetpos for positioning.) Further modifications or restrictions on file type may be required if the program cannot tolerate the addition of trailing nulls at end of file.
If a program uses standard I/O to modify a file and requires that later data in the file be preserved, the program must be restricted to certain file types (for example, VSAM or CMS-format disk files) or be modified to use UNIX style I/O.
A program that uses standard I/O to append data to an existing file cannot update an MVS PDS member. Such a program must be restricted to use of files with sequential organization or (provided binary access is used) converted to use UNIX style I/O.
If the program uses UNIX style I/O and processes a file as binary, it usually executes without modification. Performance is improved if the file is suitable for "rel" access, because then the file can be processed without copying. If the program does not depend on some of the details of UNIX style I/O (for instance, if it is not sensitive to the exact nature of file positions), it can be converted to use standard I/O for better performance.
If the program uses UNIX style I/O, processes a file as text, and uses lseek to do file positioning, it requires substantial modification. The library does not support file positioning by character number when UNIX style I/O is used to access a file as text.
If a program using either I/O package treats a file sometimes as text and sometimes as binary (that is, it interprets a new-line character as both a line separator and a physical character), the program must be modified.

Technical Summaries

This section provides detailed discussions of many of the components of C I/O, such as opening files, file positioning, and using the standard input, output, and error files. There are also sections that address I/O under OpenEdition, advanced MVS and CMS I/O facilities, and how to perform VSAM keyed I/O in C. The last section attempts to answer some of the most commonly asked I/O questions.

Standard I/O Overview

The standard I/O package provides a wide variety of functions to perform input, output, and associated tasks. It includes both standard functions and augmented functions to support 370-oriented features.

In general, a program that uses standard I/O accesses a file in the following steps:

Open the file using the standard function fopen or the augmented function afopen. This establishes a connection between the program and the external file. The name of the file to open is passed as an argument to fopen or afopen. The fopen and afopen functions return a pointer to an object of type FILE. (This type is defined in the header file <stdio.h>, which should be included with a #include statement by any program that uses standard I/O.) The data addressed by this FILE pointer are used to control all further program access to the file.
Transfer data to and from the file using any of the functions listed in this section. The FILE pointer returned by fopen is passed to the other functions to identify the file to be processed.
Close the file. After the file is closed, all changes have been written to the file and the FILE pointer is no longer valid. When a program terminates (except as the result of an ABEND), all files that have not been closed by the program are closed automatically by the library.
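The three steps can be sketched as follows. This is a minimal illustration using only ISO C functions; the function name write_then_read is hypothetical, and how the filename argument is interpreted depends on the filename style rules of the system.

```c
#include <stdio.h>

/* Write one line to a file, then read it back, following the
 * open / transfer / close sequence. Returns 0 on success, -1 on failure. */
int write_then_read(const char *filename, const char *line,
                    char *out, int outsize)
{
    FILE *fp;

    fp = fopen(filename, "w");              /* step 1: open for text output */
    if (fp == NULL)
        return -1;
    if (fputs(line, fp) == EOF || fputc('\n', fp) == EOF) {  /* step 2 */
        fclose(fp);
        return -1;
    }
    if (fclose(fp) != 0)                    /* step 3: close; data now written */
        return -1;

    fp = fopen(filename, "r");              /* the same steps, for input */
    if (fp == NULL)
        return -1;
    if (fgets(out, outsize, fp) == NULL) {
        fclose(fp);
        return -1;
    }
    fclose(fp);
    return 0;
}
```

Checking the return value of fclose matters for output files, because buffered data may not be written until the file is closed.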

For convenience, three standard files are opened before program execution, accessible with the FILE pointers stdin, stdout, and stderr. These identify the standard input stream, standard output stream, and standard error stream, respectively. For TSO or CMS programs, these FILE objects normally identify the terminal, but they can be redirected to other files by use of command-line options. For programs running under the OpenEdition shell, these FILE objects reference the standard files for the program that invoked them. More information on the standard streams is available later in this section.

Standard I/O functions may be grouped into several categories. The functions in each category and their purposes are listed in Table 3.1.

Table 3.1 Standard I/O Functions

   Function                    Purpose

   Control Functions            control basic access to
                                files

   fopen +                      opens a file

   afopen *+                    opens a file with
                                system-dependent options

   freopen +                    reopens a file

   afreopen *+                  reopens a file with
                                system-dependent options

   tmpfile                      creates and opens a
                                temporary file

   tmpnam                       generates a unique filename

   fflush                       writes any buffered output
                                data

   afflush +                    forces any buffered output
                                data to be written
                                immediately

   fclose +                     closes a file

   setbuf +                     changes stream buffering

   setvbuf +                    changes stream buffering

   Character I/O Functions      read or write single
                                characters

   fgetc                        reads a character

   getc                         reads a character (macro
                                version)

   ungetc                       pushes back a previously
                                read character

   getchar                      reads a character from stdin

   fputc                        writes a character

   putc                         writes a character (macro
                                version)

   putchar                      writes a character to stdout

   String I/O Functions         read or write character
                                strings

   fgets                        reads a line into a string

   gets                         reads a line from stdin into
                                a string

   fputs                        writes a string

   puts                         writes a line to stdout

   Array I/O Functions          read or write arrays or
                                objects of any data type

   fread                        reads one or more data
                                elements

   fwrite                       writes one or more data
                                elements

   Record I/O Functions         read or write entire records

   afread *                     reads a record

   afread0 *                    reads a record (possibly
                                length 0)

   afreadh *                    reads the initial part of a
                                record

   afwrite *                    writes a record

   afwrite0 *                   writes a record (possibly
                                length 0)

   afwriteh *                   writes the initial part of a
                                record

   Formatted I/O Functions      easily read or write
                                formatted data

   fprintf                      writes one or more formatted
                                items

   printf                       writes one or more formatted
                                items to stdout

   sprintf                      formats items into a string

   snprintf *                   formats items into a string
                                (with maximum length)

   fscanf                       reads one or more formatted
                                items

   scanf                        reads one or more formatted
                                items from stdin

   sscanf                       obtains formatted data from
                                a string

   vfprintf                     writes formatted data to a
                                file

   vprintf                      writes formatted data to
                                standard output stream

   vsprintf                     writes formatted data to a
                                string

   vsnprintf                    writes formatted data to a
                                string (with maximum length)

   File Positioning Functions   interrogate and change the
                                file position

   fseek                        positions a file

   fsetpos                      positions a file

   rewind                       positions a file to the
                                first byte

   ftell                        returns current file
                                position for fseek

   fgetpos                      returns current file
                                position for fsetpos

   Keyed Access Functions       read, write and position a
                                keyed stream

   kdelete *+                   delete a record from a keyed
                                file

   kgetpos *+                   return position of current
                                keyed file record

   kinsert *+                   add a new record to a keyed
                                file

   kreplace *+                  replace a record in a
                                keyed file

   kretrv *+                    retrieve a record from a
                                keyed file

   ksearch *+                   search for a record in a
                                keyed file

   kseek *+                     reposition a keyed file

   ktell *+                     return RBA of current record
                                of keyed file

   Error-Handling Functions     test for and continue
                                execution after I/O errors
                                and other I/O conditions

   feof +                       tests for end of file

   ferror +                     tests for error

   clearerr +                   resets previous error
                                condition

   clrerr *+                    resets previous error
                                condition

   File Inquiry Functions       obtain information about an
                                open file at execution time

   fattr *+                     returns file attributes

   fileno *                     returns file number

   ffixed *+                    tests whether a file has
                                fixed length records

   fnm *+                       returns the name of a file

   fterm *+                     tests whether a file is the
                                user's terminal

*These functions are not defined in the ANSI standard. Programs that use them should include lcio.h rather than stdio.h.
+These functions may be used with files opened for keyed access.
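As an example of how the character I/O and error-handling functions in the table combine, the following sketch reads a stream to its end and then distinguishes a normal end of file from an I/O error. The function name count_chars is hypothetical; only ISO C functions are used.

```c
#include <stdio.h>

/* Count the characters remaining in a stream.
 * Returns the count, or -1 if reading stopped because of an I/O error. */
long count_chars(FILE *fp)
{
    long n = 0;
    int c;

    while ((c = getc(fp)) != EOF)
        n++;

    if (ferror(fp)) {       /* EOF was returned because of an error */
        clearerr(fp);       /* reset the indicators so the caller can go on */
        return -1;
    }
    return n;               /* otherwise the end-of-file indicator is set */
}
```

Because getc returns EOF in both cases, the test with ferror (or feof) is the only reliable way to tell them apart.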

UNIX Style I/O Overview

The UNIX style I/O package is designed to be compatible with UNIX low-level I/O, as described in previous sections. When you use UNIX style I/O, your program still performs the same three steps (open, access, and close) as those performed for standard I/O, but there are some important distinctions.

To open a file using UNIX style I/O, you call open or aopen. (aopen is not compatible with UNIX operating systems but permits the program to specify 370-dependent file processing options.) The name of the file to open is passed as an argument to open or aopen.
open and aopen return an integer called the file number (sometimes file descriptor). The file number is passed to the other UNIX style functions to identify the file. It indexes a table containing information used to access all files accessed with UNIX style I/O. Be sure to use the right kind of object to identify a file: a FILE pointer with standard I/O, but an integer file number with UNIX style I/O.
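The difference in how a file is identified can be seen in a short sketch. The function name read_prefix is hypothetical, and the header names follow POSIX conventions, which is an assumption for any particular system.

```c
#include <fcntl.h>
#include <unistd.h>

/* Read up to max bytes from the start of a file using UNIX style I/O.
 * Returns the number of bytes read, or -1 on failure. */
long read_prefix(const char *path, char *buf, size_t max)
{
    int fd;                         /* integer file number, not a FILE pointer */
    long n;

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    n = (long)read(fd, buf, max);   /* the file number identifies the file */
    close(fd);
    return n;
}
```

Note that no FILE object appears anywhere; the integer returned by open is the only handle passed to read and close.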

By convention, UNIX assigns the file numbers 0, 1, and 2 to the standard input, output, and error streams. Some programs use UNIX style I/O with these file numbers in place of standard I/O to stdin, stdout, and stderr, but this practice is nonportable. The library attempts to honor this kind of usage in simple cases, but for the best results the use of standard I/O is recommended.

UNIX style I/O offers fewer functions than standard I/O. No formatted I/O functions or error-handling functions are provided. In general, programs that require elaborately formatted output or control of error processing should, where possible, use standard I/O rather than UNIX style I/O. Some UNIX style I/O functions, such as fcntl and ftruncate, are supported only for files in the OpenEdition hierarchical file system.

The functions supported by UNIX style I/O and their purposes are listed in Table 3.2. Note that the aopen function is not defined by UNIX operating systems. Also note that some POSIX-defined functions, such as ftruncate, are not implemented by all versions of UNIX.

Table 3.2 UNIX Style I/O Functions

   Function                    Purpose

   Control Functions

   aopen *                      opens a file with
                                system-dependent options

   close +                      closes a file

   creat                        creates a file and opens it
                                for output

   dup -                        duplicates a file descriptor

   dup2 -                       duplicates a file descriptor

   fcntl +-                     controls file options and
                                attributes

   fdopen +-                    associates a file descriptor
                                with a FILE pointer

   fsync                        forces output to be written
                                to disk

   ftruncate -                  truncates a file

   mkfifo -                     creates an OpenEdition FIFO
                                special file

   mknod -                      creates an OpenEdition
                                special file

   open                         opens a file for UNIX style I/O

   pipe -                       creates an OpenEdition pipe

   Character I/O Functions

   read +                       reads characters from a file

   write +                      writes characters to a file

   File Positioning Functions

   lseek                        positions a file

   File Inquiry Functions

   ctermid                      returns the terminal
                                filename

   isatty +                     tests whether a file is the
                                user's terminal

   ttyname                      returns the name of the
                                terminal associated with a
                                file descriptor

*This function is not defined by UNIX operating systems.
+This function is also supported for sockets.
-This function is supported only for OpenEdition files.

Opening Files

Although there are several different functions that you can call to open a file (for example, fopen, afopen, and open), they all have certain points of similarity. The filename (or pathname) and an open mode are passed as arguments to each of these functions. The filename identifies the file to be opened. The open mode defines how the file will be processed. For example, the function call fopen("input", "rb") opens the file whose name is input for binary read-only processing.

Some of the open functions enable the caller to specify a C library access method name, such as "rel", and access method parameters (or amparms) such as "recfm=v". Access method parameters allow the program to specify system or access-method-dependent information such as file characteristics (for example, record format) and processing options (for example, the number of lines per page). The details for each of these specifications are described in this section.

General filename specification

The general form of a SAS/C filename is [//]style:name, where the portion of the name before the colon defines the filename style, and the portion after the colon is the name proper. For example, the style ddn indicates that the filename is a DDname, while the style cms indicates that the filename is a CMS fileid or device name.

Note: The // before the rest of the pathname is optional, except for programs compiled with the posix option. See OpenEdition I/O Considerations for a discussion of filenames for posix-compiled programs.

The style: part of the filename is optional. If no style is specified, the style is chosen as follows:

If the pathname begins with a // prefix, the default style is tso in MVS, or cms in CMS.
If you define the external variable _style with an initial value, then that value is used as the style. (See Chapter 9 in the SAS/C Compiler and Library User's Guide, Fourth Edition for more information on the external variable _style.) For instance, if the initial value of _style is tso, then the filename XYZ.DATA is interpreted as tso:xyz.data.
If no initial value for _style is defined, the default style is ddn under MVS, and cms under CMS. This means that by default, filenames are interpreted as DDnames under MVS and as fileids or device names under CMS.
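The default rules can be summarized in a short sketch. This is a hypothetical illustration of the rules as stated, not the library's implementation: the function resolve_style, its parameters, and the treatment of any colon as a style separator are all assumptions, and the posix-compiled case is not covered.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical sketch of the default-style rules; not the library's code.
 * name:        filename as passed to an open function
 * user_style:  stand-in for an initial value of _style, or NULL if none
 * on_mvs:      nonzero to apply the MVS rules, zero for the CMS rules
 * style:       receives the resolved style (style_size must be > 0)
 * Returns a pointer to the name proper within name. */
const char *resolve_style(const char *name, const char *user_style,
                          int on_mvs, char *style, size_t style_size)
{
    int has_prefix = (strncmp(name, "//", 2) == 0);
    const char *rest = has_prefix ? name + 2 : name;
    const char *colon = strchr(rest, ':');
    const char *chosen;

    if (colon != NULL && colon != rest) {
        /* explicit style: copy the text before the colon */
        size_t len = (size_t)(colon - rest);
        if (len >= style_size)
            len = style_size - 1;
        memcpy(style, rest, len);
        style[len] = '\0';
        return colon + 1;
    }

    if (has_prefix)
        chosen = on_mvs ? "tso" : "cms";    /* "//" prefix, no style */
    else if (user_style != NULL)
        chosen = user_style;                /* initial value of _style */
    else
        chosen = on_mvs ? "ddn" : "cms";    /* system default */

    snprintf(style, style_size, "%s", chosen);
    return rest;
}
```

For instance, with the MVS rules and no _style value, the name sysin resolves to style ddn, while //xyz.data resolves to style tso.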

As an aid to migration of programs between MVS and CMS, filenames oriented toward one system, such as cms:user maclib and tso:output.list, are accepted by the other system when a clearly equivalent filename can be established. (See the next section for details.)

The rules just described apply only to programs that are not compiled with the posix compiler option. For posix-compiled programs, all pathnames are treated as hierarchical file system names, unless they are preceded by the // prefix, even if they appear to contain a style prefix.

Filename specification under MVS

The library supports four primary styles of filenames under MVS: ddn, dsn, tso, and hfs. A ddn-style filename is a DDname, a dsn-style filename is a data set name in JCL syntax, a tso-style filename is a data set name in TSO syntax, and an hfs-style filename is a pathname referencing the OpenEdition hierarchical file system.

ddn-style filenames A filename in ddn style is a valid DDname, possibly preceded by leading white space. The filename can be in uppercase or lowercase letters, although it is translated to uppercase letters during processing. The following alternate forms are also allowed, permitting access to PDS members and to the TSO terminal:

 *
 *ddname
 ddname*
 ddname(member-name)

A ddn-style filename of * always references the user's TSO terminal. If you use this filename in a batch job, it references the SYSTERM DDname, if the file is being opened for output or append. (See Open modes for more information.) The filename * is not supported for input in batch. For a program executing under the OpenEdition shell, a ddn-style filename of * is interpreted as referencing /dev/tty in the hierarchical file system.

A ddn-style filename of *ddname references the terminal for a TSO session or the DDname indicated for a batch job. For example, the filename *sysin requests use of the terminal under TSO or of the DDname SYSIN in batch.

A ddn-style filename of ddname* references the indicated DDname, if that DDname is defined. If the DDname is not defined, ddname* references the TSO terminal or SYSTERM in batch (for an open for output or append). For example, the filename LOG* requests the use of the DDname LOG, if defined, and otherwise, the user's TSO terminal.

A ddn-style filename of ddname(member-name) references a member of the PDS identified by the DDname. For example, the filename SYSLIB(fcntl) requests the member FCNTL of the data set whose DDname is SYSLIB. If the DD statement also specifies a member name, the member name specified by the program overrides it.
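The four alternate forms can be told apart by their first and last characters. The following sketch is a hypothetical classifier written for illustration only; the enumeration and function names are not part of the library, and the name is assumed to have had its style prefix and leading white space removed already.

```c
#include <string.h>

enum ddn_form {
    DDN_TERMINAL,       /* "*"                   */
    DDN_TERM_OR_DD,     /* "*ddname"             */
    DDN_DD_OR_TERM,     /* "ddname*"             */
    DDN_MEMBER,         /* "ddname(member-name)" */
    DDN_PLAIN           /* plain DDname          */
};

/* Classify a ddn-style name by its syntactic form. */
enum ddn_form classify_ddn(const char *name)
{
    size_t len = strlen(name);

    if (strcmp(name, "*") == 0)
        return DDN_TERMINAL;        /* the terminal */
    if (name[0] == '*')
        return DDN_TERM_OR_DD;      /* terminal under TSO, DDname in batch */
    if (len > 0 && name[len - 1] == '*')
        return DDN_DD_OR_TERM;      /* DDname if defined, else the terminal */
    if (len > 0 && name[len - 1] == ')' && strchr(name, '(') != NULL)
        return DDN_MEMBER;          /* member of the PDS for the DDname */
    return DDN_PLAIN;
}
```

The sketch only recognizes the syntax; deciding whether a DDname is actually defined, or whether the program is running under TSO or in batch, is left to the library.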

With the availability of OpenEdition, another ddn-style filename is possible:

 ddname/filename

Here, ddname is a valid DDname, and filename is a valid POSIX filename (not containing any slashes).

For more information on this form, see Using HFS directories and PDS members interchangeably.

Note: Programs invoked via the OpenEdition exec system call do not ordinarily have access to DD statements. A SAS/C extension allows environment variables to be used in place of DD statements, as described in OpenEdition I/O Considerations.

dsn-style filenames A filename in dsn style is a valid, fully qualified data set name (possibly including a member name), optionally preceded by leading white space. The data set name can be in uppercase or lowercase letters, although it is translated to uppercase letters during processing. The data set name must be completely specified; that is, there is no attempt to prefix the name with a userid, even for programs running under TSO. (Programs that want to have the prefix added should use the tso filename style.) For more information on data set names and their syntax, consult the IBM manual MVS/ESA JCL Reference.

The following alternate forms for dsn-style names are also allowed, permitting access to temporary data sets, the TSO terminal, DUMMY files, and SYSOUT files:

 *
 nullfile
 sysout=class
 &tmpname

A dsn-style filename of * always references the user's TSO terminal. If this filename is used in a batch job, it references the SYSTERM DDname, if the file is being opened for output or append. (See Open modes.) The filename * is not supported for input in batch. For a program running under the OpenEdition shell, * is interpreted as referencing /dev/tty.

A dsn-style filename of nullfile references a DUMMY (null) data set. Reading a DUMMY data set produces an immediate end of file; data written to a DUMMY data set are discarded.

A dsn-style filename of sysout=class references a SYSOUT (printer or punch) data set of the class specified. The class must be a single alphabetic or numeric character, or an asterisk.

A dsn-style filename of &tmpname references a new temporary data set, whose name is &tmpname. The name is limited to eight characters.

tso-style filenames A filename in tso style is a data set name (possibly including a member name) specified according to TSO conventions, optionally preceded by leading white space. The data set name can be in uppercase or lowercase letters, although it is translated to uppercase letters during processing. If the data set name is not enclosed in single quotation marks, the name is prefixed with the user's TSO prefix (normally the userid), as defined by the TSO PROFILE command. If the data set name is enclosed in single quotation marks, the quotes are removed and the result is interpreted as a fully qualified data set name. For more information on TSO data set names and their syntax, consult the IBM manual TSO Extensions User's Guide.

Note: tso-style filenames are not guaranteed to be available, except for programs executing under TSO. If you attempt to open a tso-style filename in MVS batch (or under the OpenEdition shell), the userid associated with the batch job is used as the data set prefix. Determining the userid generally requires RACF or some other security product to be installed on the system. If the userid cannot be determined, the open call will fail.
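
The prefixing rule for tso-style names can be sketched as a small string function. This is an illustration under stated assumptions, not the library's code: the function name resolve_tso_name is hypothetical, the prefix is supplied by the caller rather than obtained from the TSO profile or a security product, and a quoted name is required to have its closing quotation mark. The sketch covers ordinary data set names only.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Resolve a tso-style name to a fully qualified, uppercase data set name.
   prefix is the caller-supplied TSO prefix (hypothetical parameter).
   Returns 0 on success, -1 if the name is malformed or does not fit. */
int resolve_tso_name(const char *name, const char *prefix,
                     char *out, size_t outsize)
{
    size_t len, i;
    int n;

    while (isspace((unsigned char)*name))    /* leading white space is allowed */
        name++;
    len = strlen(name);
    if (len == 0)
        return -1;

    if (name[0] == '\'') {
        /* Quoted: remove the quotes and use the name as is. */
        if (len < 3 || name[len - 1] != '\'')
            return -1;
        n = snprintf(out, outsize, "%.*s", (int)(len - 2), name + 1);
    } else {
        /* Unquoted: add the prefix as the first qualifier. */
        n = snprintf(out, outsize, "%s.%s", prefix, name);
    }
    if (n < 0 || (size_t)n >= outsize)
        return -1;

    for (i = 0; out[i] != '\0'; i++)         /* names are translated to uppercase */
        out[i] = (char)toupper((unsigned char)out[i]);
    return 0;
}
```

With a prefix of fred, the name input.data resolves to FRED.INPUT.DATA, and the name 'sys1.maclib(dcb)' resolves to SYS1.MACLIB(DCB).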

The following alternate forms for tso-style names are also allowed, permitting access to the TSO terminal and DUMMY files.

 *
 'nullfile'

A tso-style filename of * always references the user's TSO terminal. For programs running under the OpenEdition shell, it is interpreted as referencing the HFS file called /dev/tty.

A tso-style filename of 'nullfile' references a DUMMY data set. Reading a DUMMY data set produces an immediate end of file; data written to a DUMMY data set are discarded.

cms-style filenames For compatibility with CMS, the MVS version of the SAS/C library accepts cms-style filenames, where possible, by transforming them into equivalent tso-style filenames. See the next section, "Filename specification under CMS," for details on the format of cms style filenames.

A cms-style filename is transformed into a tso-style filename by replacing spaces between the filename components with periods, removing the MEMBER keyword, and adding a closing parenthesis after the member name, if necessary. Also, the filenames cms: * and cms: are interpreted as tso: * and tso: 'nullfile', respectively. For instance, the following transformations from cms-style to tso-style names are performed:

 cms: profile exec a1               tso: profile.exec.a1
 cms: lc370 maclib (member stdio    tso: lc370.maclib(stdio)
 cms: reader                        tso: reader
 cms: *                             tso: *
 cms:                               tso: 'nullfile'
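
The transformation shown in the table can be approximated with ordinary string handling. The following sketch is an illustration only; cms_to_tso and its helpers are hypothetical names, and the library's actual parsing may differ. It operates on the text that follows the cms: prefix and assumes that any prefix of MEMBER at least three characters long is an acceptable form of the keyword.

```c
#include <ctype.h>
#include <string.h>

/* Append s to the string in out; fail if it does not fit in outsize bytes. */
static int append(char *out, size_t outsize, const char *s)
{
    size_t used = strlen(out), n = strlen(s);

    if (used + n + 1 > outsize)
        return -1;
    memcpy(out + used, s, n + 1);
    return 0;
}

/* Accept MEM, MEMB, MEMBE, or MEMBER in any case (an assumption about
   which abbreviations are valid). */
static int is_member_keyword(const char *tok)
{
    static const char kw[] = "member";
    size_t i, len = strlen(tok);

    if (len < 3 || len > 6)
        return 0;
    for (i = 0; i < len; i++)
        if (tolower((unsigned char)tok[i]) != kw[i])
            return 0;
    return 1;
}

/* Convert the text following "cms:" into the text following "tso:".
   Returns 0 on success, -1 on a malformed or oversized name. */
int cms_to_tso(const char *cms, char *out, size_t outsize)
{
    char buf[256];
    char *paren, *tok, *member;
    int ncomponents = 0;

    if (outsize == 0 || strlen(cms) >= sizeof buf)
        return -1;
    strcpy(buf, cms);
    out[0] = '\0';

    paren = strchr(buf, '(');          /* split off "(member ..." if present */
    if (paren != NULL)
        *paren++ = '\0';

    /* Join the fileid components with periods. */
    for (tok = strtok(buf, " \t"); tok != NULL; tok = strtok(NULL, " \t")) {
        if (ncomponents++ > 0 && append(out, outsize, ".") != 0)
            return -1;
        if (append(out, outsize, tok) != 0)
            return -1;
    }
    if (ncomponents == 0)              /* empty name: the dummy file */
        return paren == NULL ? append(out, outsize, "'nullfile'") : -1;
    if (paren == NULL)
        return 0;

    paren[strcspn(paren, ")")] = '\0'; /* drop a closing parenthesis, if any */
    tok = strtok(paren, " \t");
    member = strtok(NULL, " \t");
    if (tok == NULL)
        return -1;
    if (member == NULL)
        member = tok;                  /* no keyword given: "(stdio" */
    else if (!is_member_keyword(tok))
        return -1;

    if (append(out, outsize, "(") != 0 || append(out, outsize, member) != 0)
        return -1;
    return append(out, outsize, ")");
}
```

Applied to the left-hand column of the table, this sketch produces the right-hand column; for example, lc370 maclib (member stdio becomes lc370.maclib(stdio).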

hfs-style filenames A filename in hfs style is a pathname in the OpenEdition hierarchical file system. If the pathname begins with a /, the pathname is an absolute pathname, starting at the root directory. If the pathname does not begin with a /, it is interpreted relative to the current directory.

Note: You cannot open an HFS directory using fopen or open. The opendir function must be used for this purpose.

Filename specification under CMS

The library supports five primary styles of filename under CMS: cms, xed, ddn, sf, and sfd. A cms- or xed-style filename is a CMS fileid or device name. A ddn-style file is a DDname (FILEDEF or DLBL name). A sf-style filename is the name of a CMS shared file system file, and a sfd-style filename is a pattern defining a subset of a CMS shared file system directory.

The only difference between the cms and xed styles is that, if a program is running under XEDIT, use of the xed prefix allows reading of the file from XEDIT, rather than from disk.

cms- and xed-style filenames A filename in cms style is a CMS fileid or device name. You can specify fileids in one of two formats: a CMS standard format, or a compressed format. The compressed format contains no blanks, so it can be used in cases in which the presence of blanks is not allowed, such as in command-line redirections. The xed style permits a subset of the valid cms specifications, as described in Advanced CMS I/O Facilities . Here is the standard form for a cms-style filename:

 filename  [ filetype [ filemode ]] [(MEMber member-name [)]]

The brackets indicate optional components. The filename may be preceded by white space and can be in uppercase or lowercase letters, although it is translated to uppercase letters during processing. Detailed rules for this style of filename are as follows:

If no filetype is specified, the filetype FILE is assumed, unless the MEMBER keyword is present, in which case, the filetype MACLIB is assumed.
If filemode is omitted or is specified as *, a search is made for an existing file on an accessed disk using the standard CMS search order. If no existing file can be located, and the open mode permits output, a filemode of A1 is assumed.
You can specify a member-name only for files whose filetype is MACLIB or TXTLIB, opened for input. The keyword MEMber may be abbreviated to MEM. (The xed style does not allow a member name to be specified.)

Here is the compressed form for a cms-style filename:

 filename [. filetype [. filemode ]] [( member-name )]

This form of filename is interpreted in exactly the same way as the corresponding standard name. For example, cms: freds maclib (mem smith) and cms: freds.maclib(smith) are equivalent specifications. For more information on CMS fileids, consult the IBM CMS manuals listed in Chapter 1, "Introduction," in the SAS/C Compiler and Library User's Guide, Fourth Edition .

The following alternate forms for cms-style names are also allowed, permitting access to unit record devices and members of GLOBAL MACLIBs or TXTLIBs. (See Advanced CMS I/O Facilities for a description of access to GLOBAL MACLIBs and TXTLIBs.) After each form, valid abbreviations are given. (None of these forms can be used with the xed style.)

    Alternate Forms                 Abbreviations

    TERMINAL                        TERM, *

    READER                          RDR

    PRINTER                         PRT

    PUNCH                           PUN, PCH

    %MACLIB(member member-name )   %MACLIB( member-name )

    %TXTLIB(member member-name )   %TXTLIB( member-name )

Also, an empty filename ("") may be used to open a dummy file.

Note: To open a CMS disk file whose filename is the same as one of the above device names, you must specify both the filename and the filetype.

ddn-style filenames A filename in ddn style is a valid DDname, possibly preceded by white space. The filename can be in uppercase or lowercase letters, although it is translated to uppercase letters during processing. The DDname must be previously defined using the FILEDEF command (or the DLBL command for a VSAM file). The following alternate forms are also allowed, permitting access to members of MACLIBs, TXTLIBs, and MVS PDSs, and to the CMS terminal. (All forms have approximately the same meaning as under MVS.) For more information, see Filename specification under MVS .

 *
 *ddname
 ddname*
 ddname( member-name )

A ddn-style filename of * always references the user's CMS terminal.

A ddn-style filename of *ddname also references the terminal. (The DDname is never used because the terminal is always defined under CMS.)

A ddn-style filename of ddname* references the indicated DDname, if that DDname is defined. If the DDname is not defined, it references the CMS terminal. For example, the filename LOG* requests the use of the DDname LOG, if defined, and otherwise, the user's terminal.

A ddn-style filename of ddname( member-name ) references a member of the MACLIB, TXTLIB, or MVS PDS identified by the DDname. For example, the filename SYSLIB(fcntl) requests the member FCNTL of the file whose DDname is SYSLIB. If the FILEDEF command also specifies a member name, the member name specified by the program overrides it.

tso-style filenames For compatibility with MVS, the CMS version of the library accepts tso-style filenames where possible, by transforming them into equivalent cms-style filenames. See Filename specification under MVS for details on the format of such filenames.

A tso-style filename is transformed into a cms-style filename by removing single quotation marks, if present, and treating the resulting name as a compressed format fileid. (The result must be a valid CMS fileid or the open fails.) In addition, the specification tso: * is interpreted as cms: terminal. For instance, the following transformations from tso-style to cms-style names are performed:

 tso: input.data                    cms: input data
 tso: parser.source.c               cms: parser source c
 tso: 'sys1.maclib(dcb)'            cms: sys1 maclib (member dcb
 tso: *                             cms: terminal

sf-style filenames A sf-style filename references a file in the CMS shared file system. See Using the CMS Shared File System for detailed information on the syntax of sf-style filenames.

sfd-style filenames A sfd-style filename references a CMS shared file system directory or directory subset. See Using the CMS Shared File System for detailed information on the syntax of sfd-style filenames.

Open modes

The second argument to each open routine is an open mode, which defines how the file will be processed. This argument is specified differently, depending on whether you are using standard I/O or UNIX style I/O, but the basic capabilities are the same.

Standard I/O open modes When you open a file using standard I/O, the open mode is a character string consisting of one to three enquoted characters. The syntax for this string is as follows:

 r|w|a[+][b|k]

The first character must be 'r', 'w', or 'a'. After the first character, a '+' may appear, and after the '+' (or after the first character, if '+' is omitted), 'b' or 'k' may appear. No blanks may appear in the string, and all characters must be lowercase letters.

The 'r|w|a' character specifies whether the file is to be opened for reading, writing, or appending. If a '+' appears, both reading and writing are permitted.

If a 'b' appears, the file is accessed as a binary stream. If a 'k' appears, the file is accessed as a keyed stream. If neither 'b' nor 'k' appears, the file is accessed as text. See Text access and binary access for detailed information on the differences between text and binary access. See Using VSAM Files for information on keyed access.
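
The open mode syntax is small enough to check by hand. The sketch below parses a mode string into its three parts; struct open_mode and parse_open_mode are hypothetical names used only for this illustration. It accepts exactly the form r|w|a[+][b|k] and deliberately rejects any other spelling.

```c
enum stream_kind { STREAM_TEXT, STREAM_BINARY, STREAM_KEYED };

struct open_mode {
    char access;              /* 'r', 'w', or 'a' */
    int update;               /* nonzero if '+' was present */
    enum stream_kind kind;    /* text unless 'b' or 'k' was given */
};

/* Returns 0 and fills *m if mode has the form r|w|a[+][b|k]; -1 otherwise. */
int parse_open_mode(const char *mode, struct open_mode *m)
{
    const char *p = mode;

    if (*p != 'r' && *p != 'w' && *p != 'a')
        return -1;
    m->access = *p++;

    m->update = (*p == '+');
    if (m->update)
        p++;

    m->kind = STREAM_TEXT;
    if (*p == 'b') {
        m->kind = STREAM_BINARY;
        p++;
    } else if (*p == 'k') {
        m->kind = STREAM_KEYED;
        p++;
    }

    /* Anything left over (blanks, uppercase letters, extra characters)
       makes the string invalid. */
    return *p == '\0' ? 0 : -1;
}
```

For example, the string r+k parses as read access with update and a keyed stream, while R and r b are both rejected.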

The effect of the 'r|w|a' specification and the '+' are closely linked and must be explained together.

A file opened with open mode 'r' or 'rb' is a read-only file. The file must already exist. (See File existence.)
A file opened with open mode 'rk' is a read-only file suitable for keyed access. Records can be retrieved, but not replaced, deleted, or inserted. If the file has not been loaded, the open will fail.
A file opened with open mode 'r+' or 'r+b' can be both read and written. The file must already exist. (See File existence.)
A file opened with open mode 'r+k' can be read, or written using keyed access, or both. All file operations are permitted. If the file has not been loaded, the open will fail.
A file opened with open mode 'w' or 'wb' is a write-only file. If the file already exists, its previous contents are discarded.
A file opened with open mode 'wk' is a write-only file suitable for keyed access. If the file contains any records, they are erased. If the file is not defined as REUSABLE and it contains any records, the open will fail. This open mode enables records to be added to the file, but not to be retrieved, updated, or deleted.
A file opened with open mode 'w+' or 'w+b' can be both read and written. If the file already exists, its contents are discarded when it is opened.
A file opened with open mode 'w+k' can be read, or written using keyed access, or both. If the file contains any records, they are erased. If the file is not defined as REUSABLE and it contains any records, the open will fail. All file operations are permitted.
A file opened with open mode 'a' or 'ab' can only be written. If the file exists, its contents are preserved. All output is appended to the end of file.
A file opened with open mode 'ak' is a write-only file suitable for keyed access. Records can be inserted, but not retrieved, replaced, or deleted. (Records can be inserted at any point in the file, not just at the end.) The file does not have to be loaded in advance.
A file opened with open mode 'a+' or 'a+b' can be both read and written. If the file exists, its contents are preserved. Whenever an output request is made, the file is positioned to the end of file first; however, reading may be performed at any file position. The file is initially positioned to the start of the file.
A file opened with open mode 'a+k' can be read and/or written using keyed access. Records can be retrieved or inserted, but not replaced or deleted. (Records can be inserted at any point in the file, not just at the end.) The file does not have to be loaded in advance.

Note: For compatibility with some PC C libraries, certain variant forms of the open mode parameter are accepted. The order of the '+' and the 'b' may be reversed, and an 'a' may appear in place of the 'b' to request that the file be accessed as text.
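
As a brief illustration of the 'a' and 'r' open modes, the following sketch appends lines to a text file and counts them afterward. It uses only standard C functions. The function names append_line and count_lines are hypothetical, and the filename is passed through unchanged, so any filename style accepted by fopen on the target system can be used.

```c
#include <stdio.h>

/* Append one line to a text file. Mode "a" creates the file if it does
   not exist and preserves any existing contents. Returns 0 on success. */
int append_line(const char *filename, const char *text)
{
    FILE *f = fopen(filename, "a");

    if (f == NULL)
        return -1;
    if (fputs(text, f) == EOF || fputc('\n', f) == EOF) {
        fclose(f);
        return -1;
    }
    return fclose(f) == 0 ? 0 : -1;
}

/* Count the lines in a text file. Mode "r" requires the file to exist,
   so -1 is returned if the open fails. */
long count_lines(const char *filename)
{
    FILE *f = fopen(filename, "r");
    long n = 0;
    int c;

    if (f == NULL)
        return -1;
    while ((c = getc(f)) != EOF)
        if (c == '\n')
            n++;
    fclose(f);
    return n;
}
```

Calling append_line twice on a new file and then calling count_lines returns 2, because the second open in mode 'a' keeps the line written by the first.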

UNIX style I/O open modes When you open a file using UNIX style I/O, the open mode is an integer, with open mode options indicated by the presence or absence of particular bit settings. The open mode is normally specified by ORing symbolic constants that specify the options required. For instance, the specification O_RDONLY|O_BINARY is used for a read-only file to be accessed as a binary stream. The symbolic constants listed here are all defined in the header file <fcntl.h>.

The following open mode options are supported by UNIX style I/O:

O_RDONLY: specifies that the file will be read but not written. If you do not specify O_WRONLY or O_RDWR, O_RDONLY is assumed.
O_WRONLY: specifies that the file will be written but not read.
O_RDWR: specifies that the file will be both read and written.
O_APPEND: specifies that the file will be positioned to the end of file before each output operation.
O_CREAT: specifies that if the file does not exist, it is to be created. (See File existence.) If O_CREAT is omitted, an attempt to open a file that does not exist fails.
O_TRUNC: specifies that if the file exists, the file's current contents will be discarded when the file is opened.
O_EXCL: is meaningful only if O_CREAT is also set. It excludes the use of an already existing file.
O_NONBLOCK: specifies the use of non-blocking I/O. This option is meaningful only for OpenEdition HFS files.
O_NOCTTY: specifies that the file is not to be treated as a "controlling terminal." This option is meaningful only for OpenEdition HFS files.
O_BINARY: specifies that the file be accessed as a binary stream. If O_TEXT is not specified, O_BINARY is assumed. (The synonym O_RAW is supported for compatibility with other compilers.)
O_TEXT: specifies that the file be accessed as a text stream.

Note: UNIX I/O does not support keyed streams.
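
The combination of O_CREAT and O_EXCL is commonly used to create a file only when it does not already exist. The sketch below assumes a POSIX-style open function that takes a permission argument as its third parameter; whether that argument has any effect for files other than HFS files is not covered here. The function name create_exclusive is hypothetical.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

/* Create path for writing only if it does not already exist.
   Returns a file descriptor, or -1 (with errno set) on failure. */
int create_exclusive(const char *path)
{
    /* S_IRUSR | S_IWUSR: owner read/write permission for the new file. */
    return open(path, O_WRONLY | O_CREAT | O_EXCL, S_IRUSR | S_IWUSR);
}
```

A second call with the same path fails, because O_EXCL excludes the use of the file created by the first call.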

Table 3.3 defines equivalent forms for standard I/O and UNIX style I/O open modes. Some UNIX style I/O open modes have no standard I/O equivalents.

Table 3.3 Standard I/O and UNIX Style I/O Open Modes

   
 Standard form   UNIX style form

   'r'              O_RDONLY | O_TEXT

   'rb'             O_RDONLY

   'r+'             O_RDWR | O_TEXT

   'r+b'            O_RDWR

   'w'              O_WRONLY | O_CREAT | O_TRUNC | O_TEXT

   'wb'             O_WRONLY | O_CREAT | O_TRUNC

   'w+'             O_RDWR | O_CREAT | O_TRUNC | O_TEXT

   'w+b'            O_RDWR | O_CREAT | O_TRUNC

   'a'              O_WRONLY | O_APPEND | O_CREAT | O_TEXT

   'ab'             O_WRONLY | O_APPEND | O_CREAT

   'a+'             O_RDWR | O_APPEND | O_CREAT | O_TEXT

   'a+b'            O_RDWR | O_APPEND | O_CREAT
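
Table 3.3 can be expressed directly as a lookup table. The following sketch is illustrative only; unix_flags_for_mode is a hypothetical name. O_TEXT is not defined by every <fcntl.h>, so the sketch supplies a placeholder value where it is missing. That placeholder is an assumption made so the example compiles elsewhere, and flags built with it should not be passed to open on such systems.

```c
#include <fcntl.h>
#include <string.h>

#ifndef O_TEXT
#define O_TEXT 0x4000    /* placeholder value, for illustration only */
#endif

/* Return the UNIX style open mode equivalent to a standard I/O open mode,
   following Table 3.3, or -1 if the mode has no UNIX style equivalent. */
int unix_flags_for_mode(const char *mode)
{
    static const struct {
        const char *mode;
        int flags;
    } table[] = {
        { "r",   O_RDONLY | O_TEXT },
        { "rb",  O_RDONLY },
        { "r+",  O_RDWR | O_TEXT },
        { "r+b", O_RDWR },
        { "w",   O_WRONLY | O_CREAT | O_TRUNC | O_TEXT },
        { "wb",  O_WRONLY | O_CREAT | O_TRUNC },
        { "w+",  O_RDWR | O_CREAT | O_TRUNC | O_TEXT },
        { "w+b", O_RDWR | O_CREAT | O_TRUNC },
        { "a",   O_WRONLY | O_APPEND | O_CREAT | O_TEXT },
        { "ab",  O_WRONLY | O_APPEND | O_CREAT },
        { "a+",  O_RDWR | O_APPEND | O_CREAT | O_TEXT },
        { "a+b", O_RDWR | O_APPEND | O_CREAT }
    };
    size_t i;

    for (i = 0; i < sizeof table / sizeof table[0]; i++)
        if (strcmp(mode, table[i].mode) == 0)
            return table[i].flags;
    return -1;    /* for example, keyed modes such as "rk" */
}
```

Keyed modes return -1 because UNIX style I/O does not support keyed streams.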

Library access method selection

When you use afopen or afreopen to open a file, you can specify the library access method to be used. If you use some other open routine, or specify the null string as the access method name, the library selects the most appropriate access method for you. If you specify an access method that is incompatible with the attributes of the file being opened, the open fails, and a diagnostic message is produced. Six possible access method specifications are available:

A null ("") access method name allows the library to select an access method.
The "term" access method applies only to terminal files.
The "seq" access method is primarily oriented towards sequential access. ("seq" may also be specified for terminal files, in which case, the "term" access method is automatically substituted.)
The "rel" access method is primarily oriented toward access by relative character number. The "rel" access method can be used only when the open mode specifies binary access. Additionally, the external file must have appropriate attributes, as discussed in 370 Perspectives on SAS/C Library I/O.
The "kvs" access method provides keyed access to VSAM files.
The "fd" access method provides access to OpenEdition hierarchical file system files.

When no specific access method is requested by the program, the library selects an access method as follows:

"term" for a TSO or CMS terminal file
"kvs" if the open mode specifies keyed access
"fd" for a hierarchical file system file
"rel" if the open mode includes binary access and the file has suitable attributes
"seq" otherwise.

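The default selection can be pictured as a chain of tests. The sketch below assumes that the list above is applied in the order given, which the text does not state explicitly. The structure file_traits and the function select_access_method are hypothetical; in particular, rel_capable stands in for the file attribute checks that the library performs.

```c
#include <string.h>

/* Hypothetical summary of what is known about the file being opened. */
struct file_traits {
    int is_terminal;    /* TSO or CMS terminal file */
    int is_hfs;         /* hierarchical file system file */
    int rel_capable;    /* attributes suitable for the "rel" access method */
};

/* Choose an access method name for a standard I/O open mode string. */
const char *select_access_method(const struct file_traits *f, const char *mode)
{
    int keyed = strchr(mode, 'k') != NULL;
    int binary = strchr(mode, 'b') != NULL;

    if (f->is_terminal)
        return "term";
    if (keyed)
        return "kvs";
    if (f->is_hfs)
        return "fd";
    if (binary && f->rel_capable)
        return "rel";
    return "seq";
}
```

For a disk file with suitable attributes, this sketch chooses "rel" for mode 'rb' and "seq" for mode 'r'.
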
Access method parameters

When you use afopen, afreopen, or aopen to open a file, you can optionally specify one or more access method parameters (or amparms). These are system-dependent options that supply information about how the file will be processed or allocated.

The amparms are specified as character strings containing one or more specifications of the form amparm=value , separated by commas (for example, "recfm=v,reclen=100"). You can specify the amparms in any order and in uppercase or lowercase letters. (However, the case of the value for the eof and prompt amparms is significant.)
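
To make the amparm=value syntax concrete, the following sketch looks up one keyword in an amparms string. It is not the library's parser; amparm_value is a hypothetical name. Keywords are matched without regard to case and the value is returned unchanged, because the case of some values is significant. The sketch assumes that values contain no commas and that no blanks surround the equal sign.

```c
#include <ctype.h>
#include <string.h>

/* Find keyword in an amparms string such as "recfm=v,reclen=100" and copy
   its value to out. Returns 0 if found, -1 if absent or too long for out. */
int amparm_value(const char *amparms, const char *keyword,
                 char *out, size_t outsize)
{
    size_t klen = strlen(keyword);
    const char *p = amparms;

    while (*p != '\0') {
        const char *end = strchr(p, ',');
        size_t len = end != NULL ? (size_t)(end - p) : strlen(p);
        size_t i;

        if (len > klen && p[klen] == '=') {
            /* Compare the keyword without regard to case. */
            for (i = 0; i < klen; i++)
                if (tolower((unsigned char)p[i]) !=
                    tolower((unsigned char)keyword[i]))
                    break;
            if (i == klen) {
                size_t vlen = len - klen - 1;

                if (vlen + 1 > outsize)
                    return -1;
                memcpy(out, p + klen + 1, vlen);   /* value case is preserved */
                out[vlen] = '\0';
                return 0;
            }
        }
        p += len;
        if (*p == ',')
            p++;
    }
    return -1;
}
```

For the string recfm=v,reclen=100, looking up reclen yields the value 100, and looking up blksize fails.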

There are two sorts of amparms: those that describe how the file will be processed and those that specify how an MVS file will be created when the filename is specified in dsn or tso style. All amparms are accepted under both MVS and CMS, but their exact interpretation and their defaults differ from system to system, as described in the following section. Inapplicable amparms are ignored rather than rejected whenever reasonable.

The function descriptions for afopen, afreopen, and aopen provide examples of typical amparm usage.

File processing amparms The file processing amparms may be classified into the following four categories:

File Characteristics

recfm=f/v/u: operating system record format
reclen=nnn|x: operating system record length
blksize=nnn: operating system block size
keylen=nnn: VSAM key length requirement
keyoff=nnn: VSAM key offset requirement
org=value: file organization requirement

File Usage

print=yes|no: file destined to be printed
page=nnn: maximum lines per page (with print=yes)
pad=no|null|blank: file padding permitted
trunc=yes|no: effect of output before end of file
grow=yes|no: controls whether new data can be added to a file
order=seq|random: specifies whether records for a file are normally processed in sequential or random order
commit=yes|no: specifies whether modifications to a file should be committed when the file is closed
dirsrch=value: used when opening a CMS Shared File System directory to specify the information to be retrieved from the directory

Terminal Options

eof=string: end-of-file string
prompt=string: terminal input prompt

VSAM Performance Options

bufnd=nnn: number of data I/O buffers VSAM is to use
bufni=nnn: number of index I/O buffers VSAM is to use
bufsp=nnn: maximum number of bytes of storage to be used by VSAM for file data and index I/O buffers
bufsize=nnn: size, in bytes, of a DIV window for a linear data set
bufmax=n: number of DIV windows for a linear data set

See Terminal I/O for a discussion of the eof and prompt amparms. See VSAM-related amparms for a discussion of the VSAM Performance amparms.

The default amparms vary greatly between MVS and CMS, so they are described separately for each system.

File characteristics amparms The recfm, reclen, blksize, keylen, keyoff, and org keywords specify the program's expectations for record format, maximum record length, block size, key length, key offset, and file organization. If the file is not compatible with the program's recfm, reclen, or blksize specifications, it is still opened, but a warning message is directed to the standard error file. If the file is not compatible with the program's keylen, keyoff, or org specifications, a diagnostic message is produced, and the open fails.

If the file is being opened for output and the previous file contents are discarded, the file will, if possible, be redefined to match the program's specifications, even if these are incompatible with the previous file attributes. This is not done if any of the file's contents are to be preserved, because changing the file characteristics may make this data unreadable. (One effect of this is that the characteristics of an MVS partitioned data set are never changed, because even if one member is completely rewritten, other members continue to exist.)

The effects of these amparms are sometimes different from similar specifications on a DD statement, a TSO ALLOCATE command, or a CMS FILEDEF. JCL or command specifications always override any previously established file characteristics, but amparms override only if the library can determine that this is safe.

Details of the file characteristics amparms include the following:

The recfm amparm defines the file's expected record format. recfm=f indicates fixed length records; recfm=v and recfm=u indicate varying length records. Under MVS, recfm=v and recfm=u request the DCB attributes RECFM=V and RECFM=U, respectively. Under CMS, the two are equivalent, except when a filemode 4 or OS data set is processed.
VSAM files are always treated by the library as RECFM V, because they are never restricted by the system to a single record length.
The recfm amparm must be specified as exactly f, v, or u. The inclusion of other characters valid in a JCL specification (for example, recfm=vba) is not permitted.
The reclen amparm defines the maximum length record the program expects to read or write. The specification reclen=x (which is not permitted with recfm specifications other than v) indicates that there is no maximum record length.
Under MVS, the value of reclen might not be the same as the LRECL of the data set being opened. For RECFM=V data sets, the LRECL includes 4 bytes of control information, but the reclen value contains only the length of the data portion of a record. This allows a reclen specification to have the same meaning under MVS and CMS, despite the different definitions of LRECL in the two systems.
Under MVS, a reclen=x output file is created with RECFM=VBS,LRECL=X, which allows arbitrarily long records. Under CMS, a reclen=x output allows records up to 65,535 bytes, which is the maximum permitted by CMS.
For VSAM ESDS and RRDS files, the value of reclen must take into account the four-byte key field maintained by the library at the start of the records processed by the program. For example, if the maximum physical record for an ESDS data set is 400 bytes, then you should specify reclen=404 in the amparms.
The blksize amparm specifies the maximum block size for the file as defined by the operating system. Under MVS and CMS for filemode 4 files, this is equivalent to the DCB BLKSIZE parameter. (Thus, for files with record format V, the V-format control bytes are included in the blksize value.)
For an OpenEdition HFS file, the blksize amparm controls the size of the buffer used by the library to access the file. For these files, this is the only effect of specifying blksize.
If a CMS disk file is opened for read-only with the "seq" access method and RECFM=F, the blksize amparm specifies the library's internal buffer size. If the buffer size is larger than the LRECL of the file, each input operation performed by the library reads as many records as will fit into the buffer.
When the "rel" access method is used to open CMS files, the library transfers data in the units specified. (For example, if you specify blksize=10000, the library reads or writes data 10,000 characters at a time.) Under either MVS or CMS, a large blksize specification improves performance at the cost of additional memory for buffers.
The blksize amparm under MVS is also used during allocation of a new data set specified with a dsn- or tso-style filename.
The keylen amparm specifies the length of the key field for a file accessed as keyed. You can also specify keylen=0 for a file that is not accessed as keyed. For ESDS or RRDS files, if you specify keylen, the length must be 4. If the specified length and the actual length do not agree, the open will fail.
If you open a KSDS that does not already exist, you must specify the keylen amparm to correctly create the file. If you access a VSAM file through standard I/O (that is, using text or binary access), keylen must either not be specified or be specified as 0.
The keyoff amparm specifies the offset of the key field in the record for a file accessed as keyed. If keylen=0 is specified, any keyoff= specification is ignored. For ESDS or RRDS data sets, you must specify the offset as 0. If you open a VSAM data set that does not already exist and no keyoff is specified, then keyoff=0 is assumed.
The org amparm enables the program to specify a requirement for a particular file organization. For an existing file, the library validates that the file requested has the correct organization. For a new file, the library creates the file with the requested organization, if possible.
The following values are permitted for the org amparm:

ps
is the value specified if the file is an ordinary sequential file, such as an MVS sequential data set, a CMS disk file, a tape file, or a CMS spool file. To read the directory, specify the value ps for a file that is a PDS.
os
is the value specified if the file is an OS format file under CMS, such as a filemode 4 file or a file on an OS disk.
po
is the value specified if the file is a partitioned data set, or a CMS MACLIB or TXTLIB. Under systems supporting PDSEs, the file can be either a regular PDS or a PDSE.
pds
is the value specified if the file is a regular (non-PDSE) PDS.
pdse
is the value specified if the file is a PDSE.
ks
is the value specified if the file is a VSAM KSDS.
es
is the value specified if the file is a VSAM ESDS.
rr
is the value specified if the file is a VSAM RRDS.
ls
is the value specified if the file is a VSAM linear data set.
byte
is the value specified if the file is an OpenEdition HFS file.

Certain org values are treated as equivalents in some systems to permit programs to be ported from one environment to another. Notably, the org values pds and pdse are treated like the value po under systems not supporting PDSEs, and the values os and ps are treated synonymously under MVS.
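
The reclen adjustments described earlier for RECFM=V data sets and for VSAM ESDS and RRDS files amount to adding or subtracting four bytes. The following helper functions are a sketch of that arithmetic only; their names are hypothetical.

```c
/* Bytes of control information counted in the LRECL of a RECFM=V data set. */
#define V_CONTROL_BYTES 4L

/* Length of the key field the library maintains at the start of ESDS and
   RRDS records. */
#define VSAM_KEY_BYTES 4L

/* reclen value corresponding to the LRECL of a RECFM=V data set under MVS. */
long reclen_for_v_lrecl(long lrecl)
{
    return lrecl - V_CONTROL_BYTES;
}

/* reclen value to specify for an ESDS or RRDS file whose largest physical
   record is max_physical bytes. */
long reclen_for_vsam_record(long max_physical)
{
    return max_physical + VSAM_KEY_BYTES;
}
```

For example, reclen_for_vsam_record(400) returns 404, matching the ESDS example given for the reclen amparm, and reclen_for_v_lrecl(84) returns 80.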

File usage amparms File usage amparms allow the program to specify how a file will be used. A specification that cannot be honored may cause the open to fail, generate a warning message, or cause a failure later in execution, depending on the circumstances. The exact treatment of these amparms is highly system-dependent.

The amparm print=yes or print=no indicates whether the file is destined to be printed. If you specify print=yes, ANSI carriage control characters are written to the first column of each record of the file to effect page formatting, if the file format permits this. In your C program, you can write the '\f' character to go to a new page and the '\r' character to perform overprinting.
print=yes is allowed only for files that are accessed as a text stream and whose open mode is 'w' or 'a'. If these conditions are satisfied but the file characteristics do not support page formatting, a warning message is generated, and no page formatting occurs.
If you specify print=no, then the '\f' and '\r' characters in output data are treated as normal characters, even if the file characteristics will permit page formatting to occur.
The amparm page=nnn specifies the maximum number of lines that will be printed on a page. It is meaningful only for files opened with print=yes, or for which print=yes is the default. It is ignored if specified for any other file.
The amparm pad specifies how file padding is to be performed. pad=blank requests padding with blanks, pad=null requests padding with null characters, and pad=no requests that no padding be performed. If pad=no is specified, a record that requires padding is not written and a diagnostic message is generated.
The pad amparm is meaningful only for files with fixed-length records. For files accessed as text, pad characters are added as necessary to each output record and removed from the end of each input record. For output files accessed as binary, padding only applies to the last record, and for input files accessed as binary, padding is never performed.
When new data is written to a file before the end of file, the amparm trunc specifies whether existing records following the current file position are to be erased or preserved. trunc=yes specifies erasure; trunc=no specifies preservation. If the trunc specification cannot be honored, the open fails. The primary use for this parameter is to indicate a program dependency on truncation or nontruncation and thereby avoid inappropriate file updates.
You need to use the trunc amparm only when you use open modes 'r+', 'r+b', 'w', 'w+', or 'w+b' and one of the file positioning functions (fseek, rewind, or fsetpos) to position before the end of file. Do not specify the trunc amparm with UNIX style binary I/O; UNIX style binary I/O always exhibits trunc=no behavior. You can change the length of a record in a file only if it is the last record, or if trunc=yes is in effect.
The amparm grow=yes or grow=no controls whether new data can be added to a file. The default is always grow=yes, which permits the addition of new records. The no specification is only permitted when the open mode is 'r+', 'r+b', or 'r+k', and when trunc=yes is not specified. When grow=no is specified, attempts to add new records to the file will fail. For some file types, notably MVS PDS members, use of grow=no can lead to performance improvements. In particular, for PDSE members, the fseek and fsetpos functions are supported if you specify grow=no, but not with grow=yes.
The amparm order=seq or order=random specifies whether records for the file are normally processed in sequential or random order. This amparm is meaningful for VSAM files with keyed access and for CMS shared files; correct specification can lead to performance improvements. For all other file types, this amparm is ignored. The default is determined by the access method. For VSAM, the default is order=random; for CMS shared files, the default is selected by CMS.
The amparm commit=yes or commit=no specifies whether modifications to the file should be committed when the file is closed. The no specification for the commit amparm is supported only for CMS Shared File System files, and this specification is rejected for any other file type. When commit=no is in effect, you must call the afflush function to commit updates to a shared file. If you close without calling afflush, the updates are rolled back, and the file is left unchanged.
When commit=no is specified in a call to aopen (using UNIX style I/O), the behavior is slightly different. See the fsync function description for a discussion of this case.
The dirsrch amparm is used when opening a CMS Shared File System directory to specify the information to be retrieved from the directory.
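
The effect of the pad amparm on text access to a file with fixed-length records can be pictured as follows. This is a sketch of the behavior described above, not the library's code; pad_record and unpad_record are hypothetical names, and rejecting a line that is longer than the record is a choice made for this example.

```c
#include <string.h>

/* Build a fixed-length output record of reclen bytes from a text line,
   filling the remainder with pad_char (a blank for pad=blank, '\0' for
   pad=null). record must have room for reclen bytes and is not
   null-terminated. Returns 0, or -1 if the line is longer than reclen. */
int pad_record(const char *line, size_t reclen, char pad_char, char *record)
{
    size_t len = strlen(line);

    if (len > reclen)
        return -1;
    memcpy(record, line, len);
    memset(record + len, pad_char, reclen - len);
    return 0;
}

/* Return the length of an input record once trailing pad characters have
   been removed. */
size_t unpad_record(const char *record, size_t reclen, char pad_char)
{
    while (reclen > 0 && record[reclen - 1] == pad_char)
        reclen--;
    return reclen;
}
```

For instance, padding the line abc to a record length of 8 with blanks gives a record of abc followed by five blanks, and removing the padding on input gives back a length of 3.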

Amparms - MVS details

This section discusses amparms under MVS. It provides an explanation of and defaults for each amparm.

File characteristics For input files, or output files in which some or all of a file's previous contents are preserved, the file characteristics amparms serve as advice to the library regarding the file characteristics expected. If the actual file does not match the program's assumptions, a warning message is generated. In some cases, no warning is generated, even though the file characteristics are not exactly what the program specified. For instance, if a program specifies amparms of "recfm=v,reclen=80" and opens an input file with LRECL 40, no diagnostic is generated, because all records of the input file are consistent with the program's assumption of input records with a maximum length of 80 bytes.

To determine the characteristics of a file, the library merges information from the amparms, control language specifications, and the data set label. Unlike amparms information, control language specifications always override the physical file characteristics recorded in the label.

For each of these amparms, processing is as follows:

recfm

If you specify recfm=f, the program expects records of equal length, and a warning is generated if the file does not have fixed-length records (blocked or unblocked). If recfm=v or recfm=u is specified for a read-only file, no diagnostic is ever generated. For a write-only or update file, a warning is generated if the MVS RECFM does not match the amparm.

Note: VSAM linear and RRDS files are always considered to have RECFM F, and other VSAM files and OpenEdition HFS files are considered to have RECFM V.

reclen

If you specify reclen=nnn for a read-only file, a warning is generated if the file's record size is larger, or if it is not equal and the record format is fixed. If reclen=x is specified for a read-only file, a diagnostic is never generated, except when the record format is fixed. Note that under MVS, the program's reclen specification is compared to the LRECL - 4 for a V-format file, not to the LRECL itself. (Additionally, for a file with carriage control characters, the control character is not counted.)

If you specify reclen=nnn for a write-only or update file, a warning is generated if the file's record size is not the same as the reclen specification. If you specify reclen=x, a warning is generated unless the file has RECFM=VBS or RECFM=VS and LRECL=X.

VSAM linear data sets are always considered to have reclen=4096.
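
The read-only reclen comparison described above can be sketched as a small predicate. This is only an illustration of the stated rule, not library code: the function name and its parameters are hypothetical, and the reclen=x case is not modelled.

```c
#include <assert.h>

/* Hypothetical helper, not part of the SAS/C library.
 * Returns 1 if opening a read-only MVS file with reclen=nnn would draw
 * a warning under the rule stated in the text, and 0 otherwise.
 *   recfm   'f', 'v', or 'u': the actual record format of the file
 *   lrecl   the LRECL of the file as recorded by MVS
 *   has_cc  nonzero if records carry a carriage control character
 *   reclen  the program's reclen=nnn value
 */
int reclen_warns_read_only(char recfm, int lrecl, int has_cc, int reclen)
{
    int data_len;

    assert(lrecl > 0 && reclen > 0);

    data_len = lrecl;
    if (recfm == 'v')
        data_len -= 4;      /* V-format: compare against LRECL - 4 */
    if (has_cc)
        data_len -= 1;      /* the control character is not counted */

    if (data_len > reclen)
        return 1;           /* file records are larger than expected */
    if (recfm == 'f' && data_len != reclen)
        return 1;           /* fixed format must match exactly */
    return 0;
}
```

For the case mentioned earlier in this section (amparms of "recfm=v,reclen=80" and an input file with LRECL 40), the predicate returns 0, which corresponds to no diagnostic.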

blksize

If you specify blksize=nnn, a warning is generated only if the actual blksize is greater than that specified.

When a write-only or update file is opened and none of the file's previous contents are preserved, the file's characteristics are changed to correspond to the program's amparms specifications. The details of this process are outlined here:

recfm specifications of f, v, and u become MVS RECFM's of FB, VB, and U, respectively. However, if you select the "rel" access method, RECFM FBS is chosen. If the reclen is x or the blksize is less than the reclen with recfm=v, MVS RECFM VBS is requested. Finally, if you also specify print=yes, ANSI carriage control characters (RECFM=A) are added.
A reclen=x specification requests use of LRECL=X. reclen=nnn requests an LRECL of nnn, unless the record format includes V, in which case it requests nnn+4.
A blksize=nnn specification requests an MVS BLKSIZE of nnn. If the requested BLKSIZE is not compatible with the chosen RECFM and LRECL, the BLKSIZE is rounded, if possible.
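
The mapping from amparms to requested MVS characteristics can be sketched as follows. The function names are hypothetical, the order in which the rules are applied is one reading of the list above, and the sketch assumes that the VBS rule applies only when recfm=v is in effect. BLKSIZE rounding is not modelled.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: builds the MVS RECFM requested for a newly
 * created file. 'out' must have room for at least 5 characters. */
void requested_recfm(char recfm, int use_rel, int reclen_is_x,
                     int reclen, int blksize, int print_yes,
                     char out[8])
{
    assert(out != NULL);

    if (use_rel)
        strcpy(out, "FBS");             /* "rel" access method */
    else if (recfm == 'f')
        strcpy(out, "FB");
    else if (recfm == 'u')
        strcpy(out, "U");
    else if (reclen_is_x || blksize < reclen)
        strcpy(out, "VBS");             /* recfm=v with spanned records */
    else
        strcpy(out, "VB");

    if (print_yes)
        strcat(out, "A");               /* ANSI carriage control */
}

/* Hypothetical helper: LRECL requested for reclen=nnn. A record format
 * that includes V adds 4 to the program's record length. */
int requested_lrecl(const char *mvs_recfm, int reclen)
{
    assert(mvs_recfm != NULL && reclen > 0);
    return strchr(mvs_recfm, 'V') != NULL ? reclen + 4 : reclen;
}
```

Under these assumptions, amparms of "recfm=v,reclen=80" with a sufficiently large blksize request RECFM VB and LRECL 84.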

When a file is opened and neither the file nor any amparms fully specify the file characteristics, the following rules apply:

If the "rel" access method is in use, RECFM=F, FS, or FBS is required. The default record length is 1, and the default blksize is 4080.
For the "seq" access method, choices are made based on device type and the presence or absence of a print specification, as shown in Table 3.4.

Table 3.4 MVS Default File Characteristics Amparms

   Device Type           recfm   reclen   blksize

   Card punch             f       80       80

   Printer/SYSOUT/DUMMY   v       132      141

   Other (print=yes)      v       132      6144

   Other (print=no)       v       255      6144

File usage The amparm print has two uses: it specifies whether the corresponding file includes ANSI carriage control characters, and it specifies whether the C library should process the control characters '\r' and '\f' to effect page formatting. If print=no is specified, then '\r' and '\f' are simply treated as data characters, even if the file can support page formatting. If you specify print=yes, then the library attempts to perform page formatting. However, if the associated file does not have the correct attributes, '\r' and '\f' are treated as new lines, and a warning message is generated when the file is opened. If neither print=no nor print=yes is specified, the library chooses a default based on the attributes of the external file. However, print=yes is supported only for a file accessed as a text stream.

Under MVS, a file is considered to be suitable for page formatting if it has the following characteristics:

It is a SYSOUT or printer file.
Its RECFM includes the letter A, indicating support for ANSI control characters.

For files with RECFM A, space for the control character is included in the LRECL, but not in any reclen specification made by the program.

The page amparm, which specifies the number of lines to be printed on each page of a print=yes file, does not have a default. That is, if page is not specified, page ejects will occur only when a '\f' is written to the file.

Note: The print and page amparms are ignored when opening files in the OpenEdition hierarchical file system. Control characters are always written to an HFS file, regardless of whether you specified print.

The pad amparm specifies whether padding of records will take place, and, if so, whether a blank or a null character ('\0') will be used as the pad character. The default depends on the library access method and the open mode, as follows:

If the access method is "rel", the default is pad=null.
If the access method is "seq" and the file is accessed as text, the default is pad=blank.
If the access method is "seq" and the file is accessed as binary, the default is pad=no; that is, padding is not performed.

The trunc amparm indicates to the library whether the program requires that output before the end of file erase the following records or leave them unchanged. Under MVS, whether existing data are preserved in this situation depends only on file type and access method. If trunc is omitted, the value corresponding to the file's actual behavior is used; if a conflicting trunc specification is made, the file fails to open. If a file is processed by the "rel" access method, is a VSAM file, or is in the OpenEdition HFS, only trunc=no is supported. For all other MVS file types, only trunc=yes is supported.

The grow amparm indicates to the library whether new data can be added to a file opened for "r+", "r+b", or "r+k". The specification grow=no informs the library that the program will only replace existing records of a file, rather than adding any data to the end. When you specify grow=no for a file processed with BSAM, the library can open it for UPDAT rather than OUTIN. This allows the library to support use of the fseek or fsetpos functions on a PDSE member. grow=no implies trunc=no.

File creation amparms These amparms are used under MVS with filenames specified in dsn or tso style when the file does not exist and must be created. These amparms are accepted under CMS, and for ddn-style names or existing files under MVS, but in these cases they are ignored.

Note: VSAM files can be created directly by a C program only in MVS, and only if the Storage Management Subsystem (SMS) is active. On CMS, or if SMS is not available, VSAM files must be created by the Access Method Services (AMS) utility before they can be accessed by a C program.

The file creation amparms are as follows:

alcunit=block|trk|cyl|rec|krec|mrec: unit of space allocation
space=nnn: primary amount of space to allocate
extend=nnn: secondary amount of space to allocate
dir=nnn: number of PDS directory blocks
vol=volser: requested volume serial number
unit=name: requested unit name
rlse=yes|no: release unused file space when file is closed
dataclas=name: data class for a new file
storclas=name: storage class for a new file
mgmtclas=name: management class for a new file

The meanings of these amparms and their defaults are discussed in the following list. Default values are site-specific and may have been changed at the time the SAS/C library was installed. Consult your SAS Software Representative for C compiler products to determine the defaults at your site.

The alcunit amparm defines how the values specified for space and extend are to be interpreted. If you specify alcunit=block, the space and extend values are interpreted as the number of physical blocks to allocate. (If you specify alcunit=block, you must also specify the blksize amparm to define the size of the blocks.) Similarly, alcunit=trk specifies that the space and extend values are to be interpreted as the number of disk tracks, and alcunit=cyl specifies that they are to be interpreted as the number of disk cylinders.
If you use the rec specification, the space and extend amparms are expressed in numbers of records. The krec specification expresses these values in units of 1024 records, and the mrec specification expresses these values in units of 1,048,576 records. If you use one of these specifications, you must also specify the reclen amparm to define the record length. Use of these options is recommended only when the Storage Management Subsystem of MVS is installed and active. If SMS is not active and either rec, krec, or mrec is specified for the alcunit amparm, the library attempts to convert the specification to an equivalent specification in blocks, tracks, or cylinders. However, the conversion is at best approximate, and may allocate substantially more or less space than actually required.
The space amparm specifies the amount of disk space to be initially allocated to the file, in the units specified by alcunit. For instance, the amparm string "alcunit=block,blksize=5000,space=100" requests enough space to hold 100 blocks of 5000 characters each. If there is not enough disk space available, the open fails.
The extend amparm specifies the amount of additional disk space allocated to a file if the existing space runs out during processing. A file can be extended up to 15 times, after which any attempt to add more data to the file fails. As with space, the extend value is interpreted in the units specified by alcunit.
The amparms alcunit, space, and extend must be specified together. Specifying space without alcunit or extend without space will cause the open to fail without creating a file.
The amparm specification "alcunit=alc,space=spc,extend=ext" is equivalent to the JCL specification SPACE=(alc,(spc,ext)), where an "alc" of "block" is replaced by the value of the blksize amparm, and where an "alc" of "rec", "krec" or "mrec" is replaced by an appropriate AVGREC JCL parameter.
When a nonexisting data set specified in dsn or tso style is opened and no space specification is supplied by the program, a default amount of space is allocated. The standard default allocation is that specified by "alcunit=block,blksize=1000, space=10,extend=3".
The dir amparm specifies the number of directory blocks to allocate when a partitioned data set is created. (See MVS/DFP Using Data Sets for more information on PDS directories.) The dir amparm applies only to partitioned data sets. If dir is specified in the amparms but the filename does not include a member name, a member name of TEMPNAME is assumed. If no dir amparm is specified and the filename does not include a member name, a sequential data set is created. If a member name is specified, a default dir value of 5 is assumed.
You do not have to specify the dir amparm if you request creation of a PDSE using the org=pdse amparm, since pre-allocation of directory blocks is not required.
The vol amparm specifies the volume serial on which a new data set will be created. If no vol amparm is specified for a new data set, the system is allowed to select the volume on which to create the data set.
The unit amparm specifies a unit name (such as SYSDA) to use when allocating a new file. This has the same effect as the JCL UNIT keyword. See the IBM publication MVS/XA ESA JCL Reference for more information. If unit is not specified, the normal procedures at your site for unit selection are used. (For instance, for TSO users, the default unit name is defined as part of the user profile.)
rlse specifies whether unused file space should be released when the file is closed. This option only has an effect when the file is created at the time it is opened. The default is rlse=no. Use of rlse=yes with a file that is opened and closed repeatedly may cause the maximum file size to be smaller than if rlse=no had been specified, due to repeated release of temporarily unused space. You should not specify the rlse amparm for VSAM files.
dataclas specifies the data class for a new file. This amparm is ignored if SMS is not active, or if the file already exists. Your site may choose to ignore this specification.
storclas specifies the storage class for a new file. This amparm is ignored if SMS is not active, or if the file already exists. Your site may choose to ignore this specification.
mgmtclas specifies the management class for a new file. This amparm is ignored if SMS is not active or if the file already exists. Your site may choose to ignore this specification.
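
Because alcunit, space, and extend must be specified together, it can be convenient to build the three amparms in one place. The following sketch is illustrative only: the helper names are hypothetical, and it assumes a C99-style snprintf is available. The second helper works out the largest allocation reachable when a file is extended the maximum of 15 times.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: formats the three space amparms together.
 * Returns the number of characters written, or -1 if the buffer is
 * too small to hold the complete string. */
int format_space_amparms(char *buf, size_t size, const char *alcunit,
                         unsigned space, unsigned extend)
{
    int n;

    assert(buf != NULL && alcunit != NULL);
    n = snprintf(buf, size, "alcunit=%s,space=%u,extend=%u",
                 alcunit, space, extend);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}

/* Largest total allocation, in alcunit units, if the primary amount
 * is followed by the maximum of 15 extensions. */
unsigned long max_allocation(unsigned space, unsigned extend)
{
    return (unsigned long)space + 15UL * extend;
}
```

For the standard default of space=10 and extend=3, max_allocation returns 10 + 15 * 3 = 55 blocks. The formatted string would be supplied, together with any other amparms such as blksize, as the amparms string when the file is opened.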

See the IBM publication MVS/DFP Storage Administration Reference for further information on SMS concepts such as data class and storage class.

Amparms - CMS details

This section discusses amparms under CMS. It provides an explanation of each amparm, including default values.

File characteristics For input files, or output files in which some or all of a file's previous contents are preserved, the file characteristics amparms serve as advice to the library about the file characteristics expected. If the actual file does not match the program's assumptions, a warning message is generated. To determine the characteristics of a file, the library tries to merge information from the amparms, the FILEDEF options (for a ddn-style filename), and from an existing file with the same fileid.

The processing for each of the file characteristics amparms is described here. Unless otherwise specified, the description is for CMS disk files.

recfm: If recfm=f, the program expects records of equal length. If the file is not a RECFM F file, a warning message is generated. Similarly, if recfm=v, the program expects records of varying length. If the file is not a RECFM V file and the file is opened for write-only or for update, a warning message is generated. recfm=u is treated as if it were recfm=v.
reclen: For files with fixed length records, reclen specifies the length of the records. If the record length specified by reclen does not match the LRECL of the file, a warning message is generated. reclen=x cannot be used with fixed format files.
If the file has varying length records, reclen specifies the maximum length of the records. If the LRECL of the file exceeds that specified by reclen, a warning message is generated. reclen=x implies that the records may be of any length up to 65,535.
blksize: The blksize amparm is used with the "rel" access method to specify the internal buffer size used by the library, which in turn specifies the number of records read or written by the library in each I/O operation. The blksize for the file should be a multiple of the file's logical record length. If it is not, it is rounded to the next higher multiple.
For files with fixed-length records opened for read only, blksize can also be used with the "seq" access method. In this case, blksize specifies the library's internal buffer size. If the buffer size is larger than the LRECL of the file, each input operation performed by the library reads as many records as can fit into the buffer. For example, if the file has 80-character records, specifying blksize=4000 causes the library to read 50 records at each input operation.
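
The blksize arithmetic described for CMS files can be sketched in a few lines. The function names are hypothetical; the first models the rounding to the next higher multiple of the logical record length, and the second models the number of whole records that fit in the buffer for each input operation.

```c
#include <assert.h>

/* Hypothetical helper: rounds blksize up to the next multiple of the
 * logical record length, as described for the "rel" access method. */
unsigned round_blksize(unsigned blksize, unsigned lrecl)
{
    assert(lrecl > 0);
    return ((blksize + lrecl - 1) / lrecl) * lrecl;
}

/* Hypothetical helper: number of whole fixed-length records that fit
 * in a buffer of blksize bytes. */
unsigned records_per_read(unsigned blksize, unsigned lrecl)
{
    assert(lrecl > 0);
    return blksize / lrecl;
}
```

With 80-character records, records_per_read(4000, 80) is 50, matching the example above, and round_blksize(4001, 80) is 4080.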

When an existing write-only or update file is opened, and none of the file's previous contents are preserved, the old file is erased and a new file is created. The characteristics of the new file are those specified by the amparms.

recfm: recfm=f and recfm=v cause the file to be created as RECFM F or RECFM V, respectively. Again, recfm=u is treated as if it were recfm=v.
reclen: specifies the maximum (and minimum, for recfm=f) logical record length for the file. reclen=x indicates that the records may be of any length up to CMS's maximum of 65,535.
blksize: specifies the buffer size used by the library when performing I/O operations on the file.

If the file characteristics are not completely described by the amparms, the FILEDEF options (when the ddn-style filename is used), or the file, the following defaults apply:

If the "rel" access method is in use, the amparms recfm=f, reclen=1, blksize=4080 are assumed.
For the "seq" access method, choices are based on the virtual device type, filetype, and the presence of the print amparm.
For files with filetype LISTING, files written to the virtual printer, or files with print=yes specified, the defaults are recfm=v and reclen=132.
If the file is written to the virtual punch, the defaults are recfm=f and reclen=80.
For other files, if the recfm amparm is not specified, the default is "v". If the recfm is "f", the default reclen is 80; otherwise, the default is reclen=x.
For files in OS-format or for tape files, amparms are processed as described in Amparms - MVS details. The single exception is that the default blksize for tape files is 3600, rather than 6144.

File usage The amparm print has two uses: it specifies whether the corresponding file includes ANSI carriage control characters, and it specifies whether the C library should process the control characters '\r' and '\f' to effect page formatting. If print=no is specified, then '\r' and '\f' are simply treated as data characters, even if the file can support page formatting. If you specify print=yes, then the library attempts to perform page formatting, but if the associated file does not have the correct attributes, '\r' and '\f' are treated as new lines. (A warning message is generated when the file is opened in this case.) If neither print=no nor print=yes is specified, the library chooses a default based on the attributes of the external file. However, print=yes is supported only for a file accessed as a text stream.

Under CMS, any disk file may be used with page formatting. If the filetype of the file is LISTING or the file is written to the virtual printer, the file is assumed to require ANSI control characters in the first byte of each record. (If a disk file has control characters in byte 1 and does not have a filetype of LISTING, the CMS command PRINT prints the file incorrectly unless the CC option is used.)

The page amparm, which specifies the number of lines to be printed on each page of a print=yes file, does not have a default. That is, if you do not specify page, page ejects will occur only when a '\f' is written to the file.

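The pad amparm specifies whether padding of records will take place, and, if so, whether a blank or a null character ('\0') will be used as the pad character. The default depends on the library access method and the open mode, as follows:

If the access method is "rel", the default is pad=null.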
If the access method is "seq" and the file is accessed as text, the default is pad=blank.
If the access method is "seq" and the file is accessed as binary, the default is pad=no; that is, padding is not performed.

The trunc amparm indicates to the library whether the program requires that output before the end of file erase following data or leave them unchanged. CMS disk files support both trunc=yes and trunc=no. By default, trunc=no is assumed; that is, data are not erased following a modified record. Shared file system files support only trunc=no. All other types of files support only trunc=yes, except for VSAM, which supports only trunc=no. If a program's trunc specification is not supported for the file being opened, the open fails, and a diagnostic message is generated.

File Positioning

As described in Technical Background, the 370 operating systems provide a relatively inhospitable environment for the standard C file positioning functions. For this reason, you should read this section carefully if your application makes heavy or sophisticated use of file positioning. Some understanding of MVS or CMS I/O internals is helpful.

The details of file positioning depend heavily on the I/O package and library access method used, the stream type (text or binary), and the file organization and attributes. The following discussion is organized primarily by I/O package.

File positioning with UNIX style I/O

When UNIX style I/O is used to access a file as binary, file positioning is fully supported with the lseek function, which can be used to seek to an arbitrary location in the file. However, when UNIX style I/O is used to access a file as text, the seek address is interpreted in an implementation-specific way, as when the fseek function is used. This means that, for a text stream, you should use lseek only to seek to the beginning or end of a file, or to a position returned by a previous call to lseek.

The lseek function accepts three arguments: a file number, an offset value, and a symbolic value indicating the starting point for positioning (called the seek type). The seek type is interpreted as follows:

If the seek type is SEEK_SET, the offset is interpreted as an offset in bytes from the start of the file.
If the seek type is SEEK_CUR, the offset is interpreted as the offset in bytes from the current position in the file.
If the seek type is SEEK_END, the offset is interpreted as the offset in bytes from the end of file.

If the seek type is SEEK_CUR or SEEK_END, the offset value may be either positive or negative. lseek can be used with a seek type of SEEK_SET and an offset of 0 to position to the start of a file, and with a seek type of SEEK_END and an offset of 0 to position to the end of a file.

Positioning beyond the end of a binary file with lseek is fully supported. Note that positioning beyond the end of a file does not in itself change the length of the file; you must write one or more characters to do this. When you write data to a position beyond the end of file, any unwritten positions between the old end of file and current position are filled with null characters ('\0').
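
The null-fill behavior can be observed with a short test. The sketch below is written against the POSIX declarations of lseek, read, and write (the header names are an assumption and may differ in your environment), and it uses tmpfile and fileno only to obtain a scratch file descriptor. The helper names are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Writes one byte at position pos, which may lie beyond the current
 * end of file. Returns 0 on success, -1 on failure. */
int write_byte_at(int fd, off_t pos, char c)
{
    assert(fd >= 0 && pos >= 0);
    if (lseek(fd, pos, SEEK_SET) == (off_t)-1)
        return -1;
    return write(fd, &c, 1) == 1 ? 0 : -1;
}

/* Returns 1 if the unwritten positions between the old end of file
 * and a byte written beyond it read back as '\0', and 0 otherwise. */
int gap_is_null_filled(void)
{
    FILE *tmp = tmpfile();
    unsigned char buf[10];
    int fd, i, ok = 1;

    if (tmp == NULL)
        return 0;
    fd = fileno(tmp);

    /* Byte 0 is written first; byte 9 is then written past the end. */
    if (write_byte_at(fd, 0, 'A') != 0 || write_byte_at(fd, 9, 'Z') != 0)
        ok = 0;
    if (ok && (lseek(fd, 0, SEEK_SET) != 0 || read(fd, buf, 10) != 10))
        ok = 0;
    if (ok && (buf[0] != 'A' || buf[9] != 'Z'))
        ok = 0;
    for (i = 1; ok && i < 9; i++)
        if (buf[i] != '\0')
            ok = 0;

    fclose(tmp);
    return ok;
}
```

Note that the seek to position 9 does not by itself lengthen the file; the length changes only when the byte is written.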

The lseek function returns the current file position, expressed as the number of bytes from the start of the file. It is frequently convenient to call lseek with a seek type of SEEK_CUR and an offset of 0 to obtain the current file position without changing it.
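
As an illustration of these idioms, the following sketch obtains the current position with a zero-offset SEEK_CUR call and determines the size of a binary file without disturbing the position. The helper names are hypothetical, and the headers shown are the POSIX ones, which is an assumption about your environment.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns the current file position without changing it. */
off_t current_position(int fd)
{
    assert(fd >= 0);
    return lseek(fd, 0, SEEK_CUR);
}

/* Returns the file size in bytes and restores the original position.
 * Returns -1 if any seek fails. */
off_t size_keeping_position(int fd)
{
    off_t here, end;

    assert(fd >= 0);
    here = lseek(fd, 0, SEEK_CUR);          /* remember where we are */
    if (here == (off_t)-1)
        return -1;
    end = lseek(fd, 0, SEEK_END);           /* position to end of file */
    if (end == (off_t)-1)
        return -1;
    if (lseek(fd, here, SEEK_SET) == (off_t)-1)
        return -1;                          /* could not go back */
    return end;
}

/* Small self-check on a scratch file; returns 1 on success. */
int position_demo(void)
{
    FILE *tmp = tmpfile();
    int fd, ok;

    if (tmp == NULL)
        return 0;
    fd = fileno(tmp);
    ok = write(fd, "0123456789", 10) == 10
      && lseek(fd, 4, SEEK_SET) == 4
      && size_keeping_position(fd) == 10
      && current_position(fd) == 4;
    fclose(tmp);
    return ok;
}
```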

Recall that, except for files suitable for "rel" access and OpenEdition HFS files, using UNIX style I/O is relatively inefficient. See Choosing I/O Techniques and File Organization for more information on the advantages and disadvantages of using UNIX style I/O.

File positioning with standard I/O (fgetpos and fsetpos)

Standard I/O supports the fgetpos and fsetpos functions to obtain or modify the current file position with either a text or a binary stream. Both fsetpos and fgetpos accept two arguments: a pointer to a FILE object and a pointer to an object of type fpos_t. For fsetpos, the fpos_t object specifies the new file position, and for fgetpos, the current file position is stored in this object. The exact definition of the fpos_t type is not specified by the ISO/ANSI C standard and, if you intend your program to be portable, you should make no assumptions about it. However, an understanding of its implementation by the SAS/C library can be useful for debugging or for writing nonportable applications.
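
A typical portable use of these two functions is to save a position, read ahead, and then return. The sketch below treats the fpos_t object as opaque, as recommended above; the function names are hypothetical, and tmpfile is used only to provide a scratch stream for the self-check.

```c
#include <assert.h>
#include <stdio.h>

/* Returns the next character of the stream without consuming it, by
 * saving the position with fgetpos and restoring it with fsetpos.
 * Returns EOF if the position cannot be saved or restored. */
int peek_char(FILE *f)
{
    fpos_t pos;
    int c;

    assert(f != NULL);
    if (fgetpos(f, &pos) != 0)
        return EOF;
    c = getc(f);
    if (fsetpos(f, &pos) != 0)
        return EOF;
    return c;
}

/* Small self-check on a scratch stream; returns 1 on success. */
int peek_demo(void)
{
    FILE *f = tmpfile();
    int ok;

    if (f == NULL)
        return 0;
    fputs("abc", f);
    rewind(f);
    ok = peek_char(f) == 'a'    /* peeking does not advance */
      && getc(f) == 'a'
      && peek_char(f) == 'b';
    fclose(f);
    return ok;
}
```

Because both calls can fail for some file types, the return values are checked rather than assumed.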

The library defines fpos_t using the following typedef:

 typedef struct {
    unsigned long _recaddr;    /* hardware "block" address */
    long _offset;              /* byte offset within block */
 } fpos_t;

The first element of the structure (_recaddr) contains the address of the current block or record. The exact format of this value is system- and filetype-dependent. The second element of the structure (_offset) contains the offset of the current character from the start of the record or block addressed by _recaddr. In some cases, this offset may include space for control characters.

A more precise definition of these fields for commonly used file types follows.

For a CMS disk file processed by the "seq" access method, the _recaddr is the current record number, and the _offset value is the offset of the current character from the start of the record. Record numbering starts at 1, not 0.
For a VSAM ESDS, the _recaddr is the RBA (relative byte address) of the current record, and the _offset value is the offset of the current character from the start of the record.
For an MVS disk file processed by the "seq" access method, the _recaddr is the TTR of the block preceding the block containing the current record. If the file is a PDS member, the TTR is computed relative to the start of the member, not the start of the PDS. The _offset value is the offset of the current character from the start of the block containing the current record. In computing the offset, each record is treated as if it were terminated with a single new-line character, even for a file accessed as a binary stream. This technique allows the library to easily distinguish the end of a record from the start of the next record.
For files processed by the "rel" access method and OpenEdition HFS files, the exact values in _recaddr and _offset depend on the file attributes and on previous processing. You should use fseek and ftell rather than fsetpos and fgetpos for these files if you need to construct file positions manually.

fsetpos and fgetpos are implemented to be natural and efficient, and not to circumvent limitations or peculiarities of the operating systems. For this reason, you should be aware of the following:

In special cases in which the operating system or I/O device does not provide adequate support, fsetpos and fgetpos may fail. These cases are outlined in Tables 3.5 and 3.6. When fsetpos or fgetpos cannot be supported, a diagnostic message is generated, and the return value from the function indicates that an error occurred. A proper call to fsetpos will never indicate success when the requested operation is not supported by the file.
The effect of seeking past the end of the file or past the end of a record is undefined. The file type and the situation determine whether this error will be detected. Because fgetpos never returns a file position of this sort, this situation can arise only if your program manufactures its own fpos_t values. Seeking past the end of file is supported for files processed by the "rel" access method. For these files, writing a character beyond the end of file causes intervening positions to contain hexadecimal 0s.
Positioning using an invalid _recaddr value is frequently undetected, or causes an error only when the file is next read or written.
Writing after seeking to a position before the end of file may cause any following data to be discarded, depending on the file type and whether the trunc amparm is specified when the file is opened.
You should not compare fpos_t values. In some cases, especially for files with spanned records, several different values may identify the same location in the file.
Because of the use of TTRs under MVS, file positions may differ between copies of the same file. Similarly, VSAM RBAs may change if the control interval or control area size is changed.

File positioning with standard I/O (fseek and ftell)

In many cases, when you process a file with standard I/O, you can use the fseek and ftell functions to obtain or modify the current file position. Because fsetpos and fgetpos are relatively new additions to the standard C language, fseek and ftell are more portable. However, they are also more restricted in their use. Full fseek and ftell functionality is available only when you use the "rel" access method, when a file is accessed as a text stream, or when the file is in the OpenEdition hierarchical file system. For files processed with the "rel" access method and for HFS files, fseek and ftell function exactly like lseek does for UNIX style files.

The fseek function accepts three arguments: a pointer to a FILE object, an offset value, and a symbolic value indicating the starting point for positioning (called the seek type). The offset value is a long integer, whose meaning depends on whether the file is accessed as text or binary. For binary access, the offset value is a number of bytes. For text access, the offset value is an encoded file position whose interpretation is unspecified. The seek type is one of the values SEEK_SET, SEEK_CUR, or SEEK_END, indicating positioning relative to the start of file, the current position, or the end of file, respectively.

When you access a file as binary, the fseek offset value is simply interpreted as an offset from the point specified by the seek type. For instance, fseek(f, -50L, SEEK_CUR) requests positioning 50 characters before the current character.
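
A relative seek of this kind might be used to reread data just processed. The following is a minimal sketch with hypothetical function names; it relies on full fseek support for a binary stream, which, as noted earlier, is not available for every file type.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Backs up n bytes from the current position and reads them again.
 * Returns the number of bytes read, or -1 if the seek fails. */
long reread_previous(FILE *f, char *buf, long n)
{
    assert(f != NULL && buf != NULL && n > 0);
    if (fseek(f, -n, SEEK_CUR) != 0)
        return -1;
    return (long)fread(buf, 1, (size_t)n, f);
}

/* Small self-check on a scratch binary stream; returns 1 on success. */
int reread_demo(void)
{
    FILE *f = tmpfile();        /* opened in binary update mode */
    char buf[4] = "";
    int ok;

    if (f == NULL)
        return 0;
    ok = fwrite("0123456789", 1, 10, f) == 10
      && reread_previous(f, buf, 3L) == 3
      && memcmp(buf, "789", 3) == 0;
    fclose(f);
    return ok;
}
```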

Note: Because the fseek offset value has type signed long, only files whose size is less than 2**31 bytes can be supported in a portable fashion. However, for files accessed using the "rel" access method, or stored in the OpenEdition HFS, the offset value is interpreted as unsigned long, thus allowing the use of files whose size is less than 2**32-1 bytes.

When you access a file as text, only certain combinations of offset and seek type are meaningful. When the seek type is SEEK_CUR or SEEK_END, only an offset value of 0 is meaningful, requesting no change in positioning or positioning to the end of file, respectively. When the seek type is SEEK_SET, any valid file position previously returned by ftell is accepted as an offset. Additionally, you can use an offset of 0 with SEEK_SET to reposition to the start of the file.

The ftell function accepts one argument, a pointer to a FILE object, and returns a long integer defining the current file position. For a file accessed as binary, the returned file position is the number of bytes from the start of the file. For a file accessed as text, the returned file position is in an internal format.

Note: For a text stream, ftell is the only safe mechanism for obtaining a file position for later use by fseek; that is, you cannot construct meaningful file positions in another way.
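
The following sketch shows the pattern for a text stream: a position is obtained from ftell, kept as an opaque value, and later passed back to fseek with SEEK_SET. The function names are hypothetical. The self-check uses tmpfile, which opens a binary stream; on the host where this sketch was written the two stream types behave alike, but that is an assumption, and a real program would open the file as text.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Scans forward for a line equal to key and returns the ftell value
 * recorded at the start of that line, or -1L if it is not found. The
 * returned value is meaningful only as an argument to fseek. */
long find_line(FILE *f, const char *key)
{
    char line[256];
    long pos;

    assert(f != NULL && key != NULL);
    for (;;) {
        pos = ftell(f);                     /* bookmark before reading */
        if (pos == -1L || fgets(line, sizeof line, f) == NULL)
            return -1L;
        line[strcspn(line, "\n")] = '\0';   /* drop the new-line */
        if (strcmp(line, key) == 0)
            return pos;
    }
}

/* Small self-check; returns 1 if the bookmark leads back to the line. */
int bookmark_demo(void)
{
    FILE *f = tmpfile();
    char line[256];
    long pos;
    int ok = 0;

    if (f == NULL)
        return 0;
    fputs("alpha\nbeta\ngamma\n", f);
    rewind(f);
    pos = find_line(f, "beta");
    if (pos != -1L) {
        while (fgets(line, sizeof line, f) != NULL)
            ;                               /* read on to end of file */
        ok = fseek(f, pos, SEEK_SET) == 0
          && fgets(line, sizeof line, f) != NULL
          && strcmp(line, "beta\n") == 0;
    }
    fclose(f);
    return ok;
}
```

The sketch never performs arithmetic on the value returned by ftell, since its format for a text stream is internal to the library.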

ftell computes the encoded file position for a text stream by first calling fgetpos to obtain the file position in fpos_t format. (Thus, if fgetpos is not usable with a file, neither is ftell.) Then, initial portions of the _recaddr and _offset fields of the result are combined in a manner that depends on the file organization. Because the size of an fpos_t value is 8 bytes and the size of a long integer is only 4 bytes, information is lost if either the _recaddr value or the _offset value is too large. The library detects this loss of information and returns an error indication, rather than an incorrect file position. For specific file types, the conditions under which a file position cannot be successfully returned are listed here:

an MVS disk file or PDS member larger than 256 tracks
a CMS-format disk file containing more than 65,535 records
a tape file containing more than 65,535 blocks
a VSAM file containing more than 16,777,215 characters, or a record longer than 255 bytes.

Even for files that exceed one of these limits, ftell returns an error only when the actual file position is outside the limits.

When used with a binary stream, full fseek functionality is restricted to the "rel" access method, but note that fseek with a 0 offset value is usually supported. This means that you can use fseek with almost any file to rewind, position to end of file, or switch between reading and writing. The exceptions are noted in Tables 3.5 and 3.6.
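As an illustration of what zero-offset seeks alone can do, the following sketch appends a line to a stream opened for update and then counts the lines in the file. The helper name append_and_count is hypothetical. It assumes a file type for which both zero-offset seeks are supported.

```c
#include <stdio.h>

/* Hypothetical helper: append one line to a stream opened for update,
   then rewind and count the lines in the file. Only zero offsets are used. */
long append_and_count(FILE *f, const char *line)
{
    long count = 0;
    int c;

    if (fseek(f, 0L, SEEK_END) != 0)   /* position to end of file */
        return -1;
    if (fputs(line, f) == EOF)
        return -1;
    if (fseek(f, 0L, SEEK_SET) != 0)   /* rewind; also switches writing to reading */
        return -1;
    while ((c = getc(f)) != EOF)
        if (c == '\n')
            count++;
    return count;
}
```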

Table 3.5 MVS Files with Restricted Positioning

   
   File Type                    Restrictions

   terminal                      positioning not allowed

   card reader                   positioning not allowed

   OpenEdition pipe              positioning not allowed

   DD */DATA                     only fseek with a 0 offset
                                 supported

   printer/card punch/SYSOUT     rewind and seek to the end
                                 of file accepted but
                                 ignored

   keyed VSAM (KSDS)             only fseek with a 0 offset
                                 supported

   PDS member                    seek to the end of file
                                 not supported; switch
                                 between reading and
                                 writing only supported at
                                 the start of file unless
                                 grow=no

   PDSE member                   seek to the end of file
                                 not supported; seeking
                                 other than rewind only
                                 allowed if read-only or
                                 grow=no

   RECFM=FBS (accessed as        seek to the end of file
   text)                         not supported

   multivolume disk or tape      only rewind supported if
   file                          not opened for append;
                                 only seek to the end of
                                 file supported if opened
                                 for append

   unlabeled tape with           only seek to the end of
   DISP=MOD                      file supported

   concatenated sequential       only fseek with a 0 offset
   files                         supported

Table 3.6 CMS Files with Restricted Positioning

 
   File Type             Restrictions

   terminal               positioning not allowed

   reader/printer/punch   rewind and seek to the end of file
                          accepted but ignored

   keyed VSAM (KSDS)      only fseek with a 0 offset
                          supported

   MVS PDS member         positioning not allowed with
                          %MACLIB or %TXTLIB filename used

Note: The warnings given in the previous section for use of fsetpos are equally applicable to fseek.

Terminal I/O

Performing I/O on an interactive device such as a TSO or CMS terminal is quite different from performing I/O on a disk or tape file. Some programs need to use terminal and nonterminal files interchangeably, while others need to take advantage of the special properties of terminal files. Some specific differences between terminal I/O and other I/O follow.

Note: The following considerations apply when you read from or write to the TSO or CMS terminal. They do not necessarily apply when you read or write from an OpenEdition terminal under the OpenEdition shell. The behavior of an OpenEdition terminal is defined by the POSIX standards. See the POSIX 1003.1 standard for further information.

A terminal file can be opened only with open mode 'r', 'w', or 'a'. Open for append and write-only are treated identically. If you want to both read from and write to the terminal, you must use two files.
The distinction between text access and binary access for a terminal file is quite different from the distinction for other file types. Binary access should be used only when it is necessary to send or receive terminal control characters. (See Text access and binary access.)
In theory, a terminal file has no defined end. In practice, many programs require a way that the end of file can be signalled from the terminal. When a terminal file is accessed as text, the string "EOF" (uppercase letters required) is normally interpreted as the end of file. (This string may be changed by specifying the eof amparm when the file is opened.) When a terminal file is accessed as binary, a null input line will be treated as end of file.
You must use the "term" access method for terminal I/O.
When UNIX style I/O is used to process a terminal file, the file is always accessed as text.

Buffering, flushing, and prompting

Most implementations of standard I/O perform buffered I/O; that is, characters are collected in a buffer and transmitted, one block at a time. This can cause problems for I/O to the terminal. For instance, if a terminal output file is buffered, it is possible for a terminal read to be issued before an output message asking the user to enter the input is transmitted. To write correct and portable interactive programs, it is important to understand the different ways that terminal I/O can be implemented. Some of the possible approaches are as follows:

Some implementations require the programmer to use the fflush function to force an output buffer to be written. Programs that use this technique are portable, because this works with almost any C library implementation.
Some implementations buffer terminal output on a line-at-a-time basis. This means that as long as a program writes complete lines of output (each one terminated with a new-line character), there is no problem with delayed messages.
Some implementations inhibit buffering for terminal files. This technique avoids the problem, but it is not practical under MVS or CMS.
Some implementations automatically flush terminal output buffers before terminal input is requested. This is the technique used by the SAS/C library. Other implementations apply this technique only to the standard input and output streams (stdin and stdout). This implementation allows many simple programs to run as intended without solving the general problem.
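The first approach in this list, an explicit fflush, can be sketched as follows. The helper name ask_int is hypothetical. With the SAS/C library the flush is redundant for terminal files, but it keeps the program correct under implementations that do not flush automatically.

```c
#include <stdio.h>

/* Hypothetical helper: write a prompt, force it out, then read one integer.
   Returns 0 on success, -1 if no integer could be read. */
int ask_int(FILE *in, FILE *out, const char *prompt, int *value)
{
    fputs(prompt, out);
    fflush(out);                       /* make sure the prompt precedes the read */
    return fscanf(in, "%d", value) == 1 ? 0 : -1;
}
```

A typical call would be ask_int(stdin, stdout, "Count: ", &n).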

Another situation that varies from one implementation to another, depending on the buffering strategy, is the effect of writing characters to several terminal files at the same time. In implementations that do not buffer terminal I/O, all the output characters are transmitted in the order that they are written. In an implementation that performs buffering, the output probably consists of complete lines of output, each line associated with a particular file. With the SAS/C library I/O functions, the buffer for one terminal output file is flushed when a character is written to another. This means that characters are transmitted in the same order as they are written and that characters written to two different files do not appear in the same output line.

Automatic prompting

When you use scanf or a similar function to read input from the terminal, it can be difficult to write prompts at all points where terminal input is required. For example, the following call reads two integers from the standard input stream (normally the terminal):

 scanf("%d %d", &i, &j);

If only one integer is entered, scanf issues another read to obtain the second integer without returning to the program. This means that the program is unable to issue a prompt or message to tell you that more input is required.

When you open a terminal input file, the library allows you to specify a prompt that will be written by the library immediately before a read is issued to the terminal. This allows each input request to be automatically preceded by a prompt. You can also have more than one terminal input file, each with a different prompt, allowing you to easily distinguish one file from another. This feature is requested with an amparm when you use afopen or aopen to open the file.
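For programs that must also run where no prompt amparm is available, a portable alternative is to read whole lines and parse them, so that the program regains control after every read. The sketch below is such an alternative and does not show how the library implements prompting. The helper name read_two_ints and the prompt texts are hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: collect two integers, prompting again whenever a
   line supplies fewer values than are still needed.
   Returns 0 on success, -1 on end of file. */
int read_two_ints(FILE *in, FILE *out, int *i, int *j)
{
    char line[128];
    char *p, *end;
    long v;
    int vals[2];
    int got = 0;

    while (got < 2) {
        fputs(got == 0 ? "Enter two integers: " : "Enter one more integer: ", out);
        fflush(out);
        if (fgets(line, sizeof line, in) == NULL)
            return -1;
        for (p = line; got < 2; p = end) {
            v = strtol(p, &end, 10);
            if (end == p)
                break;                 /* no further number on this line */
            vals[got++] = (int)v;
        }
    }
    *i = vals[0];
    *j = vals[1];
    return 0;
}
```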

Text access and binary access

One of the distinctions between text access and binary access to a nonterminal file is the treatment of control characters. Certain control characters, such as the new-line and form feed characters, may be transformed during text input and output. Binary access is required to transmit control characters without alteration.

The situation is similar for input and output to the terminal. Usually, the terminal is accessed as a text stream. In this mode, the new-line character separates lines, and a form feed issues several blank lines to simulate the effect of a form feed on a printer. Other control characters are removed from input and output. Both TSO and CMS are sensitive to the data sent to the terminal, and incorrect use of control characters can cause the user to be disconnected or logged off. Thus, the removal of control characters is a safety measure that prevents unpleasant consequences when uninitialized data are sent to the terminal.

However, some applications need the ability to send control characters to the terminal. This is supported by accessing the terminal as a binary stream. When a binary stream is used, output data are sent to the terminal without modification, and input data are only minimally edited. The new-line character still separates lines of output, because both TSO and CMS support this use of the new-line character. Note that even though the SAS/C library does not modify data transmitted to and from the terminal when binary access is used, the data can be modified by system software, including the VM control program, VTAM, and communication controller software. Also recall that incorrect data can cause disconnection or other errors, so use this technique with caution.

Terminal I/O under MVS-batch

Programs that open the terminal for output or append can be run in MVS-batch. In this case, the output is written to the file defined by the DDname SYSTERM. The SYSTERM data set must have record format VBA and LRECL 137, but any BLKSIZE is permissible.

You may have several terminal output files open simultaneously. In this case, lines written to the various files are interspersed, as they would be if the program were running interactively. Emulation of terminal input under MVS-batch is not supported.

Terminal I/O amparms

When you use afopen or aopen to open a terminal file, you can specify the following amparms in addition to those discussed earlier in Opening Files:

eof=string: specifies an end-of-file string.
prompt=string: specifies a terminal input prompt.

These amparms are ignored if the file to be opened is not the terminal. Each is described in more detail here:

The eof amparm enables you to specify the string that is interpreted as end-of-file if entered from the terminal. It applies only when the terminal is accessed as a text stream; when a binary stream is used, a null input line is always the end-of-file indicator, and the amparm is ignored.
Use of uppercase and lowercase letters is significant. For example, if eof=END is specified, an input line of end will not be interpreted as the end of file. The default specification for eof is eof=EOF; that is, the input line EOF will be interpreted as the end of file.
The prompt amparm enables you to specify a string that is sent to the terminal before data are read from the terminal with this stream. Use of uppercase and lowercase letters is significant. If the prompt string ends in a new-line ('\n') character under MVS, the cursor is positioned to the line following the prompt; otherwise, the cursor appears on the same line as the prompt, at the character position after the prompt. Under CMS, the presence or absence of a trailing new-line character in the prompt is not significant, because the cursor is always positioned to the first position of the terminal input area.
With one exception, the default value for the prompt amparm is "prompt="; that is, no input prompting occurs. However, when the library opens the standard input stream (stdin), a prompt consisting of the program name followed by a colon is used. See stdin, stdout, and stderr for information on overriding this default.
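A program that reads from a nonterminal file, where the eof amparm is ignored, may want to honor the same convention itself. The following sketch emulates the case-sensitive comparison described above. The helper name is_eof_line is hypothetical, and the assumption that the whole line must match the string exactly is an interpretation of the description, not a statement about the library's internals.

```c
#include <string.h>

/* Hypothetical emulation: return 1 if line (with an optional trailing
   new-line) is exactly the end-of-file string, compared case-sensitively. */
int is_eof_line(const char *line, const char *eof_string)
{
    size_t len = strlen(line);

    if (len > 0 && line[len - 1] == '\n')
        len--;                          /* ignore the line terminator */
    return len == strlen(eof_string) && memcmp(line, eof_string, len) == 0;
}
```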

Using the * filename under the shell

As an aid to porting existing programs to the OpenEdition shell, the SAS/C library allows the filename * in ddn, dsn, or tso style to access the OpenEdition terminal. When you use this filename, the eof and prompt amparms are permitted and honored. These amparms are ignored when you open the HFS file /dev/tty. Use of these amparms under the shell is not recommended for new programs, because they are not portable, but they can be useful when porting existing MVS programs.

Note: When a program running under the shell opens the * filename, no distinction is made between text and binary access. The effect of control characters on the display is defined by OpenEdition.

Using the OpenEdition Hierarchical File System

The OpenEdition Hierarchical File System (HFS) is an implementation of a UNIX file system under MVS. In this file system, a directory is a special kind of file that contains names and other information about a group of files. The root directory is at the top of the hierarchy; thus, the root directory is not contained in any other directory. Files within the file system are identified by a pathname, which consists of the series of directories (beginning with the root directory) that lead to a file. Directory names are separated by slashes (/), and the filename itself comes last. For example, the pathname /u/marie/tools/wrench.c identifies the file wrench.c, contained in the directory tools, in turn contained in the directory marie, contained in the directory u, which is contained in the root (/) directory. This type of pathname, beginning with a slash denoting the root directory, is called an absolute pathname.

A program using a hierarchical file system always has a current directory defined, either by inheritance from a calling program or by use of the chdir function. A pathname without a leading slash is called a relative pathname. Such pathnames are interpreted relative to the current directory. For instance, if a program's current directory is /u/marie and it attempts to open the pathname tools/wrench.c, the file that is actually accessed is /u/marie/tools/wrench.c.
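The interpretation of relative pathnames can be sketched as a small string operation. The helper name resolve_path is hypothetical; the file system performs this resolution itself, so a program rarely needs such a routine. The sketch does not simplify "." or ".." components.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: form the absolute pathname for path, given the
   current directory cwd. Returns 0 on success, -1 if out is too small. */
int resolve_path(const char *cwd, const char *path, char *out, size_t size)
{
    int n;

    if (path[0] == '/')                       /* already absolute */
        n = snprintf(out, size, "%s", path);
    else if (strcmp(cwd, "/") == 0)           /* avoid a doubled slash at the root */
        n = snprintf(out, size, "/%s", path);
    else
        n = snprintf(out, size, "%s/%s", cwd, path);
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}
```

With a current directory of /u/marie and the pathname tools/wrench.c, the result is /u/marie/tools/wrench.c.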

Note: When you call the fopen or open function to access an HFS file, it may be necessary to prefix the pathname with the SAS/C style prefix hfs:. See File Naming Conventions for information on when this is required.

Several different kinds of files exist in the hierarchical file system. Most files are so-called regular files, which are stored on disk (in a special MVS file system data set). The hierarchical file system also contains special files of various sorts, which may not have all the properties of regular files. For instance, some special files do not support seeking, or have different behavior when read or written. Some important examples of special files include

the user's terminal (named /dev/tty).
the null device (/dev/null). Output to /dev/null is discarded; input from /dev/null produces end-of-file.
pipes, which are files used to communicate between processes. OpenEdition supports both named and unnamed pipes.

Low-level and Standard I/O

I/O to the hierarchical file system is implemented by OpenEdition MVS via a set of services that correspond to traditional UNIX unbuffered I/O calls, such as open, read, and write. For HFS files, UNIX style I/O functions interface directly to the operating system, bypassing most of the C library's I/O support. This ensures that access to the Hierarchical File System through SAS/C has the same characteristics as access when the operating system interfaces are used directly.

When an HFS file is opened using open, the operating system returns a small integer representing the file, called the file descriptor. All other I/O operations, such as reading and writing, are performed by specifying the file descriptor. File descriptors have two important properties not applicable to more traditional MVS files:

They may be shared between programs. When several programs are reading or writing the same file, the results are well-defined; whereas, with other MVS file types, the results are undefined, and generally undesirable. See the function description of fcntl for information on how several programs sharing a file can cooperate to avoid interfering with each other.
They can be inherited by a program from its caller. For example, the program dict1 could open a file using file descriptor 4, and then call the program dict2 using the execvp function without closing file descriptor 4. When dict2 began execution, file descriptor 4 would still be open, with the same file position as at the time of the exec, and dict2 could immediately read or write the file without having to open it again.

Of course, you can use standard I/O functions rather than the low-level functions like open, read, and write to access HFS files. However, program behavior may differ, depending on which set of routines you use. When you use fopen to open an HFS file, it calls the OpenEdition open interface, and then saves the resulting file descriptor in a control block accessed via the FILE pointer. Functions such as fread and fwrite read and write data from a buffer area allocated by the library (or by the user if the setvbuf function is used), and actually read from or write to the file descriptor only as necessary to empty or fill the buffer.

For most programs, the buffering performed by standard I/O results in a performance gain, because the program does not need to call the operating system as often. However, for some programs, this can result in unacceptable behavior. For example, programs that share files usually should not use standard I/O because output data may be buffered indefinitely; therefore, updates may not become visible to other programs using the file for an arbitrary amount of time. Similarly, if a program needs to receive an open file from a calling program, it must be aware that only the file descriptor is passed. That is, a FILE pointer is local to the program that creates it, and it cannot be inherited, except under special conditions.

For applications that might need to access a file using both low-level and standard I/O, the POSIX standards define two functions that cross the boundaries:

Use fdopen to associate a FILE pointer with an open file descriptor. For example, if the program dict2 receives open file descriptor 4 from its caller, it can use the following statements to associate the FILE pointer f with file descriptor 4. Thereafter, the program uses standard I/O functions to access the file.
```
 FILE *f;
 f = fdopen(4, "r+");   /* wrap inherited descriptor 4 in a stream open for update */
```
Use fileno to extract the file descriptor for an HFS file from the FILE pointer. This can be useful if a program using standard I/O has a momentary need for a low-level I/O feature not supported via standard I/O, such as the ftruncate function.
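The use of fileno mentioned above might look like the following sketch, written against the POSIX interfaces. The helper name truncate_stream is hypothetical, and the _POSIX_C_SOURCE definition is an assumption about the compilation environment rather than a SAS/C requirement. The stream is flushed first, because buffered data have not yet reached the file descriptor.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: shorten the file behind a standard I/O stream.
   Buffered output is flushed first so the descriptor sees current data.
   Returns 0 on success, -1 on failure. */
int truncate_stream(FILE *f, off_t length)
{
    if (fflush(f) != 0)
        return -1;
    return ftruncate(fileno(f), length);
}
```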

When a program is called by exec, the library automatically uses fdopen to associate the standard files stdin, stdout, and stderr with file descriptors 0, 1, and 2. Thus, these three files are partial exceptions to the rule stated earlier that FILE pointers cannot be inherited across exec.

OpenEdition I/O Considerations

OpenEdition support in SAS/C affects I/O in several ways. SAS/C now implements two different file-naming conventions. Also, DD statements can now be allocated to HFS files or directories. Finally, with OpenEdition support, you may find it useful to modify some programs to use PDS members and HFS directories interchangeably. These considerations are described in the next three sections.

File Naming Conventions

SAS/C implements two different file-naming conventions: one for use by traditional SAS/C programs, and one for POSIX-oriented programs. The choice of naming convention depends on whether any compilation in the main program load module specifies the posix compiler option. If so, then POSIX file-naming rules apply. If no compilation specifies the posix option, then traditional SAS/C naming conventions apply.

Using traditional SAS/C rules, a filename consists of a style prefix (one to four characters, followed by a colon), followed by the filename proper. The prefix determines how the rest of the filename is to be interpreted (for example, as a DDname or an HFS pathname). If there is no style prefix, then a default prefix is assumed. The default prefix may be defined by the program by initializing the _style external variable. If _style is not initialized, the default is system-dependent. A filename, with or without an explicit style prefix, may be further prefixed by the string //. If // precedes a style prefix, the // is simply ignored. If // is present, but there is no style prefix, then the style tso (in MVS) or cms (in CMS) is assumed, independent of the _style definition. When these rules are in effect, you must do one of the following to access a file in the HFS:

Prefix the path name with hfs: or //hfs:. For example, to access the file tools/wrench.c in the current directory, open hfs:tools/wrench.c or //hfs:tools/wrench.c.
Initialize the _style external variable to "hfs". Then simply use the pathname to open HFS files. However, you will have to use a style prefix for other kinds of names, such as DDnames.
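The traditional rules can be summarized in a short sketch. The function name split_style and its default_style parameter (a stand-in for the _style value or the system default) are hypothetical. The sketch assumes that a style prefix consists of one to four letters and shows the MVS case, in which // without a prefix selects the tso style. It is not the library's actual parser.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical sketch of the traditional naming rules.
   name          - filename as passed to fopen
   default_style - stand-in for the _style value or system default
   style         - receives the style that applies (4 chars plus NUL)
   Returns a pointer to the filename proper, after any // and prefix. */
const char *split_style(const char *name, const char *default_style, char style[5])
{
    int slashes = (name[0] == '/' && name[1] == '/');
    const char *p = slashes ? name + 2 : name;
    size_t len = 0;

    /* Assumption: a style prefix is 1 to 4 letters followed by a colon. */
    while (len < 4 && isalpha((unsigned char)p[len]))
        len++;
    if (len >= 1 && p[len] == ':') {
        memcpy(style, p, len);
        style[len] = '\0';
        return p + len + 1;            /* explicit prefix; any // is ignored */
    }
    /* No prefix: // forces tso (the MVS case); otherwise use the default. */
    strncpy(style, slashes ? "tso" : default_style, 4);
    style[4] = '\0';
    return p;
}
```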

The rules above are useful for MVS-oriented programs, or for programs that must open diverse kinds of files. However, they are often not the most appropriate rules for portable applications. Notably, the POSIX.1 standard requires that any pathname not beginning with // be interpreted as a hierarchical file system pathname. For this reason, SAS/C implements alternate conventions to allow the porting and/or development of applications that conform to the POSIX.1 standard and are portable to UNIX operating systems.

These alternate rules apply whenever the main load module of a program contains at least one compilation using the posix compiler option. For such programs, the file-naming conventions are as follows:

If the name of the file to be opened does not begin with exactly two slashes (//), it is the pathname of an HFS file, even if the name appears to have a style prefix. For example, the filename /u/marie/tools/wrench.c identifies an HFS file with that pathname, and ddn:sysin identifies a file with that name in the current directory.
If the name of the file to be opened begins with two slashes (//), it is interpreted exactly as it would be interpreted according to the traditional SAS/C rules above. That is, the filename //ddn:sysin identifies the DDname SYSIN, and the filename //tools.c(wrench) identifies the MVS PDS member userid.TOOLS.C(WRENCH), where userid is the current user's id.
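The test applied to a filename under the POSIX conventions can be sketched as a single comparison. The function name uses_traditional_rules is hypothetical. Treating a name with three or more leading slashes as an HFS pathname is an interpretation of the phrase "exactly two slashes".

```c
/* Hypothetical classifier for a program compiled with the posix option:
   returns 1 if name begins with exactly two slashes (the traditional
   SAS/C rules apply), 0 if the name is taken as an HFS pathname. */
int uses_traditional_rules(const char *name)
{
    return name[0] == '/' && name[1] == '/' && name[2] != '/';
}
```

The names /u/marie/tools/wrench.c and ddn:sysin yield 0, whereas //ddn:sysin and //tools.c(wrench) yield 1.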

Note: For a program compiled with the posix option, the _style external variable is ignored.

Note: Because filenames beginning with // are interpreted in the same way for applications compiled with the posix option as for those compiled without the posix option, this form should be used by any functions that need to open files, and which can be used in programs compiled with or without the posix option. For example, to open the HFS file /u/marie/tools/wrench.c without knowing whether the program was compiled with the posix option, use the filename //hfs:/u/marie/tools/wrench.c. Any such functions must not be compiled with the posix option themselves, because then any program using such functions would automatically follow the naming conventions for programs compiled with the posix option.

Accessing HFS files using DDnames

Enhancements to MVS JCL and dynamic allocation facilities for OpenEdition MVS allow DD statements to be allocated to HFS files or directories. Parameters on the DD statement correspond roughly to arguments to the open function: the PATH option corresponds to the pathname to be opened, the PATHOPTS option corresponds to the open flags, and the PATHMODE option corresponds to the file creation mode specification.

HFS files can be accessed using a ddn-style filename, as can any other MVS file. The following points should be noted:

The open or fopen options must be compatible with the DD statement. For example, if the DD statement specifies PATHOPTS=ORDONLY, but the fopen call specifies a mode of 'r+', the open will fail.
If you specify PATHOPTS=OCREAT on the DD statement or allocation and the specified file does not exist, the file is created at the time of allocation. This means that at the time the program calls open or fopen, the file already will have been created. In particular, if the DD statement specifies PATHOPTS=(OCREAT,OEXCL) and the open call also specifies O_CREAT+O_EXCL, the open will fail, because the file will have been created when the DD statement was processed.
DD statements that reference directories cannot be opened.
Concatenated DD statements in which one or more members are HFS files cannot be opened successfully.

Using HFS directories and PDS members interchangeably

Until the availability of OpenEdition, it was often convenient to replace the use of directories in UNIX applications with PDSs when porting them to MVS. Consider porting a UNIX C compiler to the mainframe. In UNIX, a system header file like <stdio.h> is simply a file in a particular directory. In MVS, such names are generally interpreted by treating the first part of the name as a member name, relative to a PDS defined by a DDname. (For example, SAS/C interprets <stdio.h> as ddn:syslib(stdio).) With the availability of OpenEdition, it may be desirable to modify these programs to use a PDS or an HFS directory interchangeably, as convenient for the user. SAS/C provides the following extension to its ddn-style filename handling in support of this. Besides all previously accepted forms, a ddn-style filename may now have the following form:

 ddname/filename

Here, ddname is a valid MVS DDname, and filename is a valid POSIX filename (not containing any slashes). When ddn:ddname/filename is opened, the following occurs:

If the ddname is defined and allocated to an HFS directory dirname, the file dirname/filename is opened. For example, if the DDname SYSLIB references the directory /usr/include, then opening ddn:syslib/stdio.h is the same as opening hfs:/usr/include/stdio.h.
If the ddname is defined and allocated to an MVS PDS, the file ddname(member) is opened, where the member name is the same as filename, discarding the first period in the name and all succeeding characters, and truncating the remainder of the name to eight characters. For example, if the DDname SYSLIB references a PDS, then opening ddn:syslib/stdio.h is the same as opening ddn:syslib(stdio).
If the ddname is undefined, or it references some other kind of file, the open fails.
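The derivation of the member name in the PDS case can be sketched as follows. The helper name member_from_filename is hypothetical; the library performs this transformation internally.

```c
#include <string.h>

/* Hypothetical helper: derive a PDS member name from a POSIX filename by
   dropping the first period and everything after it, then keeping at
   most eight characters. member must have room for 9 bytes. */
void member_from_filename(const char *filename, char member[9])
{
    size_t len = strcspn(filename, ".");   /* length up to the first period */

    if (len > 8)
        len = 8;
    memcpy(member, filename, len);
    member[len] = '\0';
}
```

For the filename stdio.h, the helper produces the member name stdio.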

Note: When the ddname/filename syntax is used and the DDname references an HFS directory, any PATHOPTS specified on the DD statement apply to the subfile as well. Thus, if DDname SYSLIB specifies PATHOPTS=OWRONLY, opening ddn:syslib/stdio.h using open mode 'r' will fail.

Using Environment Variables in place of DDnames

When a new process is created by fork or exec, as when a program is called by the shell, a new address space is created with no DD statements allocated other than possibly a STEPLIB. For programs exclusively using UNIX oriented interfaces, this does not present a problem, but it can present difficulties for porting existing MVS applications to run under the shell. For this reason, the SAS/C library permits you to substitute environment variables for DDnames in programs invoked by the exec system call.

For a program invoked by exec, if an attempt is made to open a DDname (for example, using the filename //ddn:anyfile) and no corresponding DD statement exists, the library checks for an environment variable named ddn_ANYFILE. Notice that the prefix ddn is always in lowercase letters, while the DDname proper is always in uppercase letters. The value of the environment variable, if it exists, must have one of two forms:

If the environment variable value does not begin with a slash, the value is translated to uppercase letters and then interpreted as a fully qualified MVS dataset name. For example, if the value of ddn_MACLIB is sys1.maclib(dcb), an fopen of //ddn:maclib is treated as if the call specified //dsn:sys1.maclib(dcb). Any MVS dataset specified via a ddn_ environment variable must already exist; that is, the library will not create a new data set while processing an environment variable. However, you can reference a nonexistent member of an existing PDS.
If the environment variable begins with a slash, the value is interpreted as an HFS absolute pathname. For example, if the value of ddn_MACLIB is /usr/include/stdio.h, an fopen of //ddn:maclib is treated as if the call specified //hfs:/usr/include/stdio.h. A ddn_ environment variable can reference a nonexistent HFS file, which will then be created when the DDname is opened (if permitted by the fopen options).
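The two steps just described, forming the variable name and interpreting its value, can be sketched as string operations. Both function names are hypothetical, and the sketch only computes the equivalent filename; the allocation itself is performed by the library.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Build the environment variable name for a DDname: lowercase "ddn_"
   followed by the DDname in uppercase. Returns 0, or -1 if out is too small. */
int ddn_env_name(const char *ddname, char *out, size_t size)
{
    size_t i, len = strlen(ddname);

    if (size < len + 5)                    /* "ddn_" + name + NUL */
        return -1;
    memcpy(out, "ddn_", 4);
    for (i = 0; i < len; i++)
        out[4 + i] = (char)toupper((unsigned char)ddname[i]);
    out[4 + len] = '\0';
    return 0;
}

/* Translate a variable value into the equivalent // filename. */
int ddn_value_to_filename(const char *value, char *out, size_t size)
{
    size_t i;
    int n;

    if (value[0] == '/')                   /* HFS absolute pathname */
        n = snprintf(out, size, "//hfs:%s", value);
    else                                   /* fully qualified data set name */
        n = snprintf(out, size, "//dsn:%s", value);
    if (n < 0 || (size_t)n >= size)
        return -1;
    if (value[0] != '/')
        for (i = 6; out[i] != '\0'; i++)   /* uppercase the data set name only */
            out[i] = (char)toupper((unsigned char)out[i]);
    return 0;
}
```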

When a ddn-style filename is opened using an environment variable, the specified DDname is allocated by the library during processing. Thus, if the same program opens the DDname a second time, a DD statement will be found, and the environment variable will not be referenced again. Consequently, changing the environment variable after it has been used to open a file will be ineffective.

Note: The ddn:ddname/filename pathname format described above can be used both with DDnames defined by an environment variable and with actual DD statements.

File descriptor allocation

Whenever a file is opened using the open system call, the POSIX.1 standard requires that the file be assigned the lowest file descriptor number that is not in use by an open file. Under OpenEdition, the range of valid file descriptors is from 0 to a maximum defined by the site. The default maximum is 64, but it can be set by the site to be as low as 16 or as high as 65,536. The maximum number of open OpenEdition files can be determined using the sysconf function.
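The lowest-available rule and the sysconf query can be observed with a short POSIX sketch. The function names are hypothetical, and the _POSIX_C_SOURCE definition is an assumption about the compilation environment.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <unistd.h>

/* Open /dev/null, note which descriptor was assigned, and close it again.
   Returns the descriptor number, or -1 if the open fails. */
int probe_lowest_fd(void)
{
    int fd = open("/dev/null", O_RDONLY);

    if (fd >= 0)
        close(fd);
    return fd;
}

/* Ask the system for the per-process limit on open files. */
long max_open_files(void)
{
    return sysconf(_SC_OPEN_MAX);
}
```

Two successive calls to probe_lowest_fd return the same number, because the descriptor is free again after the first call.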

The limit on the number of open file descriptors is unrelated to the library's limit on the number of FILE pointers that may be opened using standard I/O. This limit is always 256, regardless of the OpenEdition limit. File descriptors in the valid OpenEdition range can be assigned to files other than OpenEdition files in two situations:

The library treats the FILE pointers stdin, stdout, and stderr as being file descriptors 0, 1, and 2, whether or not these are HFS files.
The SAS/C socket library assigns sockets file descriptor numbers in the OpenEdition range, because many socket programs assume that socket numbers are allocated using the rules for UNIX file descriptors.

In both of these cases, confusion can occur. For example, if file descriptor 4 is assigned to a socket and you call open, OpenEdition could assign file descriptor 4 to the newly opened file, and then the library could not distinguish a request to write to file 4 from a request for socket 4.

The library solves this problem using shadow files. Whenever the library needs to assign a file descriptor for a file that is not an OpenEdition file, it first opens /dev/null to obtain a file descriptor, which is then assigned to the socket or standard file. The shadow file is closed only when the socket or standard file is closed. Because OpenEdition associates the file descriptor with /dev/null, it will not be possible for OpenEdition to associate the descriptor with any other file. This technique also ensures that socket numbers are assigned in accordance with OpenEdition rules.
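The idea behind shadow files can be illustrated with a minimal POSIX sketch. The function names are hypothetical and the code is not taken from the library; it only shows that a descriptor number held open on /dev/null cannot be handed out by a later open.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical sketch of the shadow-file idea: reserve a descriptor
   number by opening /dev/null. The number can then label a non-file
   object (such as a socket) without the system reusing it. */
int reserve_descriptor(void)
{
    return open("/dev/null", O_RDONLY);
}

/* Release the reservation when the non-file object is closed. */
int release_descriptor(int fd)
{
    return close(fd);
}
```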

You should note the following points about file descriptor allocation:

This technique means that it is not possible to use more sockets and OpenEdition files combined than the maximum number of OpenEdition file descriptors. If this is a problem, it should be solved by raising the site file descriptor limit.
When you use the open function to open MVS files for UNIX style I/O, very large file descriptors are assigned, thereby preventing these files from affecting the OpenEdition file limit.
If you run more than one program that uses OpenEdition facilities in the same address space, they share all open file descriptors, except when a new process is created using oeattach. In such cases, file descriptors may not be assigned in the order specified by POSIX.1. This mode of operation is not recommended, because the sharing of file descriptors (and other data, such as signal handlers) between the two programs can lead to very confusing results.

stdin, stdout, and stderr

The C language definition specifies that when program execution begins, three standard streams should be open and available for program use. These are stdin, the standard input stream, stdout, the standard output stream, and stderr, the standard error stream. A number of C library functions, such as puts and scanf, are defined to use stdin or stdout automatically, without requiring you to explicitly specify a FILE pointer. Note that the standard streams are always opened for text access.

stdin, stdout, and stderr are implemented as macros, not as true variables. For this reason, you cannot assign them new values. If you want to reopen one of the standard streams, you must use the freopen or afreopen function rather than fopen or afopen.
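As a minimal sketch of reopening a standard stream, the following helper wraps freopen. The name redirect_stream is hypothetical; only the standard freopen call is assumed.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Re-associate an already open standard stream with another file.
 * Returns 0 on success and -1 on failure. On failure, freopen has
 * closed the original stream, so it must not be used again.
 */
int redirect_stream(FILE *stream, const char *name, const char *mode)
{
    assert(stream != NULL && name != NULL && mode != NULL);
    return freopen(name, mode, stream) != NULL ? 0 : -1;
}
```

A call such as redirect_stream(stdin, "ddn:input", "r") would then make later scanf or getchar calls read from the named file, whereas an assignment such as stdin = fopen(...) is not valid.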

Whether the standard streams are actually used is determined by the program, with one exception. Library diagnostic messages are written to stderr, if it can be opened successfully and is suitable for output. If stderr is unavailable, library diagnostics are written to the terminal under CMS or TSO, and to the job log under MVS-batch.

Under CMS, all three standard streams are, by default, directed to the terminal. Under MVS, the default filenames for stdin, stdout, and stderr are ddn:sysin*, ddn:sysprint*, and *, respectively. stdin uses the DDname SYSIN, if it is defined, and the terminal, otherwise. Similarly, stdout uses SYSPRINT, if it is defined, and the terminal, otherwise. stderr is directed to the terminal or to the DDname SYSTERM, if running in batch.

For a program running under OpenEdition MVS, by default stdin, stdout, and stderr are defined as file descriptors 0, 1, and 2, as passed by the calling program. If one or more of these file descriptors is not open in the calling program, any attempt to use the corresponding standard file in the called program will fail, unless it opens the appropriate file descriptors itself.

Under MVS, it is possible for one or more of the standard streams to fail to open. For instance, in batch, stdin cannot be opened unless you define the DDname SYSIN, and stderr cannot be opened unless you define the DDname SYSTERM. To avoid generating an "open failure" error message for a file that is never used, the library delays issuing a system open for a standard stream until it is first used. Note that opening a file under MVS requires significant memory. For this reason, if you write to a standard file when your program runs out of memory (for instance, when malloc fails), you may want to force the file to be opened earlier, for example, by writing an initial new-line character at a time when enough memory is known to be available.

Changing standard filenames at execution time

Because the standard streams are initialized by the library before execution rather than by an explicit call to fopen, there is no direct way to change the filenames associated with them. For this reason, C implementations traditionally support command-line redirection. This permits the user of a program to specify, on the command line that invokes the program, the filenames to be associated with the standard input and output streams. For example, the CMS command line "xyz <ddn:input >printer" invokes the program XYZ, requesting that ddn:input be used as the filename for stdin, and that printer be used as the filename for stdout. Redirection is described in detail in SAS/C Compiler and Library User's Guide, Fourth Edition. Additionally, you should be aware of the following considerations:

Even when redirection is used, the standard streams are not opened by the operating system until necessary. Therefore, any errors in the filename specified by the redirection are not detected until the file is used. If the operating system cannot open the file, the program treats it like any other I/O error. You should call the ferror function to test for errors using a standard stream, just as for any other stream, to avoid wasting time trying to read or write a file that cannot be accessed.
Names specified with redirection that do not include a specific style prefix are ordinarily assumed to be DDnames under MVS, or cms-style filenames under CMS. You can initialize the _style external variable to define a different default style, as described in Chapter 9 of SAS/C Compiler and Library User's Guide, Fourth Edition. The default style applies to all files used by the program, not just to the standard files.
When a program is invoked by the OpenEdition shell, redirections are handled by the shell, not by the SAS/C library. This means that redirections must be in the format defined by the shell, not by the SAS/C library. In particular, you cannot use a style prefix in a redirection for a program invoked by the shell.

Changing standard filenames and characteristics at compile time

Besides supporting command-line redirections, the library enables you to change the names of the standard files at compile time, or to specify amparms to be used when the files are opened. Thus, you can override some of the library defaults. If the program specifies a replacement filename and the command line includes a redirection for the same file, the filename specified on the command line is used.

To change the default name for a standard file, you must initialize an external char * variable with the filename to be used. The external variables are _stdinm, _stdonm, and _stdenm for stdin, stdout, and stderr, respectively. For example, the following declaration specifies that by default, stdin should read from the user's virtual reader:

 char *_stdinm = "cms:reader";

The _stdinm, _stdonm, and _stdenm specifications are honored even for programs called with exec. Thus, using these variables, you can override the standard use of file descriptors 0, 1, and 2 for these files if you wish. If you do this, the standard file descriptors are not closed, and can still be accessed directly via the file descriptor number.

Similarly, you can assign an initial value to the external variables _stdiamp, _stdoamp, or _stdeamp to specify the amparms to be used when stdin, stdout, or stderr is opened. The library default amparms are shown in Table 3.7:

Table 3.7 Default Amparms for the Standard Files

   File    Amparms

   stdin    prompt= pgmname :\n

   stdout   print=yes

   stderr   print=yes,page=60

You may want to override these default amparms in the following situations:

If stdin is defined as the terminal, a prompt of the form pgmname : (where pgmname is the program name or "" if the program name cannot be determined) is issued to the terminal before each read. If your program performs its own prompting, you may want to initialize _stdiamp to "" to suppress the library prompt.
Note: A standard prompt is not used when stdin is defined as file descriptor 0 (for a program called by exec), even if file descriptor 0 references the terminal.
Because the default stdout and stderr amparms include "print=yes", the library issues a warning message if the associated physical file does not support page formatting (for example, if it is an MVS data set whose record format does not include A). If you expect your program to be run with stdout or stderr associated with this type of file, you can initialize _stdoamp or _stdeamp to "print=no" to inhibit the diagnostic message.
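The two situations above might be handled with declarations like the following. This is a sketch that assumes the amparm variables have type char *, as the filename variables do; the choice to override all three is illustrative only.

```c
/* The program issues its own prompts, so suppress the library prompt
   that would otherwise precede each terminal read from stdin. */
char *_stdiamp = "";

/* stdout and stderr are expected to be associated with files that do
   not support page formatting, so request print=no to inhibit the
   library's warning message. */
char *_stdoamp = "print=no";
char *_stdeamp = "print=no";
```

These declarations belong at file scope in one source file of the program, in the same way as a _stdinm declaration.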

Using the standard streams with UNIX style I/O

In UNIX operating systems and other similar systems, it is possible to access the standard streams using low-level I/O, specifying file numbers 0, 1, and 2 for stdin, stdout, and stderr, respectively. The library supports such access, provided that certain guidelines are followed. This usage is nonportable. The following restrictions apply:

Use file number 0 (stdin) for input only, and file numbers 1 (stdout) and 2 (stderr) for output only.
Do not use lseek on any of these files.
Do not close any of these files.
Avoid using the same file with both UNIX style I/O and standard I/O. For instance, do not issue both read to file 0 and fgetc to stdin in the same program.
When OpenEdition is in use, it is possible to create confusing associations between file descriptors in certain circumstances. For example, it is possible to cause file descriptor 0 to be associated with stdout, rather than with stdin. If you call a UNIX I/O function with a standard file descriptor that is not assigned by OpenEdition, and whose corresponding standard FILE pointer is associated with a different file descriptor, the library will reject the call rather than possibly access the wrong file.
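Within these restrictions, output to a standard file number might look like the sketch below. The helper name write_fully is hypothetical, and a POSIX-style declaration of write in unistd.h is assumed.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/*
 * Write len bytes to a file descriptor with UNIX style I/O, continuing
 * after short writes. Returns 0 on success and -1 on error.
 */
int write_fully(int fd, const char *buf, size_t len)
{
    assert(buf != NULL);
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;      /* interrupted before any data was written */
            return -1;
        }
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}
```

A program that reports errors with write_fully(2, msg, strlen(msg)) should then avoid also calling fprintf(stderr, ...), so that the same file is not used with both UNIX style I/O and standard I/O.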

I/O Error and Interrupt Handling

UNIX style I/O includes no specific error-handling functions or features. If a read, write, or lseek call fails, the only indication is the value returned by the function. Depending on the error, it may be possible to continue to use the file after the error occurs.

Error handling

As stated earlier, after a file has been opened, a pointer to a FILE object is used to identify the file. This pointer is passed to I/O routines such as fread and fwrite to indicate the file to be read or written. Associated with each FILE object is a flag called the error flag that indicates whether the most recent I/O request failed. When the error flag is set, it is not possible to use the file other than to close it or to call the clearerr function to clear the flag.

The error flag for a file is set whenever an error occurs trying to access a file. The flag is set for all types of errors, whether they are hardware errors (such as an unreadable tape block), errors detected by the operating system (such as a full CMS minidisk), or errors detected by the library (such as trying to read a write-only file). In addition to setting the error flag, the library also writes a diagnostic message to the stderr stream and sets the errno external variable to indicate the type of error that occurred.

The function ferror can be called to determine whether the error flag is set for a file. Using this function is sometimes necessary because some functions, such as fread, do not distinguish in their return values between error conditions and end of file.

If you want to continue processing a file after an error occurs, you must call the clearerr function to clear the error flag; that is, to cancel the effect of the previous error. Some errors (such as running out of disk space under MVS) are so severe that it is impossible to continue to use the file afterwards without reopening it. In such cases, clearerr is unable to clear the error, and continued attempts to use the file cause new errors to be generated.
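The following sketch shows one way to combine fread, ferror, and clearerr. The enumeration and the function name read_bytes are hypothetical; only standard C library calls are used.

```c
#include <assert.h>
#include <stdio.h>

enum read_result { READ_OK, READ_EOF, READ_ERROR };

/*
 * Read up to count bytes and report why a short read happened.
 * *got receives the number of bytes actually read.
 */
enum read_result read_bytes(FILE *fp, void *buf, size_t count, size_t *got)
{
    assert(fp != NULL && buf != NULL && got != NULL);
    *got = fread(buf, 1, count, fp);
    if (*got == count)
        return READ_OK;
    if (ferror(fp)) {
        /* Cancel the error so the caller may continue to use the file.
           After a severe error, the next request fails again. */
        clearerr(fp);
        return READ_ERROR;
    }
    return READ_EOF;           /* short read caused by end of file */
}
```

The return value of fread alone cannot separate the last two cases, which is why ferror is consulted before the short read is treated as end of file.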

I/O and signal processing

In a program that handles asynchronous signals, it is possible for a library I/O routine to be interrupted by a signal. When a library I/O routine is interrupted, an interrupt flag is set for the file until the signal handler returns. Any attempt to use the file while the interrupt flag is set is treated as an error (and therefore sets the error flag) to avoid damage to the file or to library file control blocks. The situations in which the interrupt flag is most likely to be set are after using longjmp to exit from a signal handler, or when a signal handler performs I/O to a file in use at the time of the signal. When the interrupt flag is set, you can call clearerr to clear it along with the error flag and continue to use the file.

For terminal input under MVS and CMS (except with OpenEdition), the system calls do not allow signals to be detected while the program is waiting for terminal input, with one exception. The SIGINT signal, which is an attention interrupt under MVS or an IC immediate command under CMS, terminates the terminal read and causes any handler to be called immediately. If your SIGINT handler needs to read from the terminal, you should use a different FILE pointer from the one used by the rest of your program; otherwise, the error flag is set for the file, as described in the previous paragraph. If you must use the same FILE pointer in mainline code and in your handler, you need to call clearerr in the handler before reading from the terminal and call it again after exit (either by return or by longjmp) from the handler.

Augmented Standard I/O

Some 370 I/O applications are beyond the scope of standard I/O because the record concept is absent from the C language. Consider, for example, a program to make an exact copy of any input file, including duplicating the input file's record structure. Such a program could not be written using binary file access because all information about the record structure of the input file would be lost. It also could not be written using text access, because if there were any new-line characters in the input file, they would be interpreted by the program as record breaks, and the output file would contain more records than the input file. The functions afread, afread0, afreadh, afwrite, afwrite0, and afwriteh have been defined to permit this sort of application to be written in C. These functions, together with afopen, afreopen, and afflush, are known as augmented standard I/O.

afread and afwrite can only be used with binary streams. Because they are used with binary streams, they never translate or otherwise modify input or output data, even if the data include control characters. afread and afwrite are useful only when the "seq" access method is used, because a file processed with the "rel" access method is treated as a stream of characters without record boundaries. If you need to process files with fixed-length records using afread or afwrite, you should open the file with afopen, and request the use of the "seq" access method.

afread and afwrite

The afread and afwrite functions are very similar in form to the standard fread and fwrite functions: they accept as arguments a pointer to the input or output area, the size of the type of object to be read or written, the maximum number of objects, and the FILE pointer identifying the file. But, unlike fread and fwrite, whose purpose is simply to read or write the items specified without regard to record boundaries, the purpose of afread and afwrite is to read or write the items specified as a single record. Specifically, afread and afwrite read and write items as follows:

When afread is called, it reads items from the file until a record boundary is encountered. It reads, at most, the number of items specified, and it generates a diagnostic message if there are any further items in the record. It is permitted for the input record to contain fewer items than requested. In this case, afread reads as many as are present in the record, and returns the number of items read to its caller. This permits easy processing of files containing variable-length records with afread.
When afwrite is called, it writes all the items specified and then forces a record break to occur. An error message is generated if the items do not all fit in a single record, or if the file characteristics will not permit writing a record of that size.

afread and afwrite do not support zero-length records. On input, a zero-length record is ignored, and similarly, an attempt to write a zero-length record is ignored. Two alternate functions, afread0 and afwrite0, are provided. These functions can handle zero-length records, if the file being processed supports them. To support zero-length records, afread0 and afwrite0 use error-reporting conventions that are not compatible with the standard C fread and fwrite functions.

afread and afwrite do not require that the file be positioned to a record boundary when they are called. Also, you can freely mix calls to afread and afwrite with calls to other standard I/O routines, such as fscanf or fseek, if your application requires it. See the function descriptions for afread and afwrite for examples of their use.
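The record-at-a-time behavior of afread can be illustrated with a small in-memory model. This is not library code: struct rec_file and model_afread are hypothetical, the model always starts at a record boundary, and it assumes that trailing bytes that do not form a whole item are dropped.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory stand-in for a file with record boundaries. */
struct rec_file {
    const unsigned char *data;  /* record contents stored back to back */
    const size_t *rec_len;      /* length in bytes of each record */
    size_t rec_count;           /* number of records */
    size_t next;                /* index of the next record to read */
    size_t offset;              /* byte offset of that record in data */
};

/*
 * Model of the afread behavior described above. Reads one record as at
 * most count items of the given size, skips zero-length records, and
 * sets *truncated when the record held more items than were requested
 * (the case in which the real function generates a diagnostic).
 * Returns the number of items stored, or 0 at end of file.
 */
size_t model_afread(void *ptr, size_t size, size_t count,
                    struct rec_file *f, int *truncated)
{
    size_t len, items;

    assert(ptr != NULL && f != NULL && truncated != NULL && size > 0);
    *truncated = 0;

    /* Zero-length records are ignored on input. */
    while (f->next < f->rec_count && f->rec_len[f->next] == 0)
        f->next++;
    if (f->next == f->rec_count)
        return 0;

    len = f->rec_len[f->next];
    items = len / size;
    if (items > count) {
        items = count;
        *truncated = 1;
    }
    memcpy(ptr, f->data + f->offset, items * size);

    /* The read always ends at the record boundary. */
    f->offset += len;
    f->next++;
    return items;
}
```

With records of 3, 0, and 4 bytes, a request for 8 one-byte items returns 3, and the next request skips the empty record and reads from the 4-byte record.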

afreadh and afwriteh

afreadh and afwriteh enable you to read or write a header portion of a record before calling afread or afwrite to process the remainder. This is useful for reading or writing files processed by another language (such as COBOL or Pascal) that supports variant records.

A variant record is a record composed of two parts, a fixed format part and a variable format part. The fixed format part contains information common to all records, and a field defining the length or structure of the remainder of the record. Depending on the situation, it may not be possible to read or write such records conveniently using afread and afwrite. (Defining the records to be processed as a C union is helpful only if all the different variants are the same size.) afreadh and afwriteh support processing such records in a straightforward way:

afreadh is similar to afread, except it does not require that a record break occur after the last item read. However, all items read must be contained in a single record, or an error message is generated. afreadh is most frequently used to read the first part of a variant record.
afwriteh is similar to afwrite, except it does not force a record break after the last item written. However, all the items written must fit into a single record, or an error message is generated. afwriteh is most frequently used to write the first part of a variant record.

See the function descriptions for afreadh and afwriteh for examples of their use.

Advanced MVS I/O Facilities

This section discusses several advanced I/O tasks under MVS, such as reading a PDS directory, recovering from ABENDs, PDSE access, and processing DIV objects.

Reading a partitioned data set directory

You can read a PDS directory sequentially by allocating the entire library to a DDname and specifying the DDname, without a member name, as the filename. For instance, you can use the following TSO code fragment to open the directory of SYS1.MACLIB for input:

 system("tso:alloc file(sysmacs) da('sys1.maclib') shr");
 direct = fopen("ddn:sysmacs", "rb");

You can also access the PDS directory by opening the PDS using a "dsn"- or "tso"-style name, and specifying the amparm "org=ps", as in

 direct = afopen("dsn:sys1.maclib", "rb", "seq", "org=ps");

The directory is treated by the library as a RECFM=F, LRECL=256 data set, regardless of the attributes of the members.

You cannot use C I/O to modify a PDS directory. Also, access to a PDS directory is supported using only ddn-style filenames, unless the org amparm is used. If you specify a PDS using a dsn- or tso-style filename without an org specification, and no member name is present, the member name TEMPNAME will be used.

Recovering from B37, D37, and E37 ABENDs

When an I/O operation requires additional space to be allocated to a file but space is unavailable, the program is normally terminated by the operating system with a B37, D37, or E37 ABEND. The SAS/C library intercepts these ABENDs and treats them as error conditions. It sets the error flag for the affected file and returns an error code from the failing I/O function. The ABEND is intercepted using a DCB ABEND exit, not a STAE or ESTAE, and functions correctly even if you use the nohtsig run-time option to suppress the library's ESTAE.

When the library recovers from one of these ABENDs, the file is automatically closed by the operating system. For this reason, the error flag is set permanently; that is, you cannot clear the flag with clearerr and continue to use the file. An exception is made by the "rel" access method, which reopens the file if you use clearerr to clear the error condition. This enables you to read or modify data you have already written, but you cannot add any more records to the file, because this will simply cause the ABEND to recur.

Although other kinds of I/O errors are quite rare, these out-of-space ABENDs occur frequently, even for production programs. Therefore, you should always check output operations for success to avoid loops when trying to write to a file that can no longer be accessed.
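A minimal sketch of checked output follows. The function name write_records is hypothetical; it uses only standard fwrite and fflush, and it stops at the first failure rather than retrying.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Write n fixed-size records, stopping at the first failure.
 * Returns 0 if every record was written and flushed, and -1 otherwise.
 * After a failure the caller should stop writing to the file.
 */
int write_records(FILE *out, const void *recs, size_t rec_size, size_t n)
{
    const char *p = recs;
    size_t i;

    assert(out != NULL && recs != NULL && rec_size > 0);
    for (i = 0; i < n; i++) {
        if (fwrite(p + i * rec_size, rec_size, 1, out) != 1)
            return -1;         /* stop: do not loop on a failed file */
    }
    /* Buffered output may not fail until it is flushed. */
    return fflush(out) == 0 ? 0 : -1;
}
```

Checking the result of every output call in this way keeps an out-of-space condition from turning into a loop of failing writes.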

Using a PDSE

Recent releases of MVS/ESA have introduced an extended implementation of partitioned data sets, called a PDSE (Partitioned Data Set Extended). These files are compatible with ordinary PDS data sets, but have a number of advantages, including the following:

Space within a PDSE is allocated dynamically, so PDSEs do not require compression.
Several members of a PDSE can be written at the same time. Different programs can write different members of the same PDSE without interfering with each other.
The directory for a PDSE does not have to be allocated in advance; therefore, a PDSE can expand indefinitely without running out of directory blocks.

The SAS/C Library includes support for PDSEs. Most programs that presently access PDS members can access PDSE members without change.

Restrictions

Although PDSEs are compatible in most ways with standard PDSs, they do not support either BSAM INOUT or OUTIN processing, which enable a member to be read and written at the same time. When the fseek or fsetpos functions are used on a PDS member, the library depends on this processing, except for a read-only file. For this reason, the use of fseek or fsetpos on a PDSE member is not supported unless the member is read-only, or unless you specify grow=no. One exception is that fseek(f, 0L, SEEK_SET) can always be used to reposition a PDSE member to the start of file.

Note: When a PDSE member is accessed through UNIX style I/O in binary mode, this restriction does not apply. In this case, full use of the lseek function for repositioning is supported.

Access via the grow amparm

The SAS/C Library defines the amparm grow, which can be specified when a file is opened for "r+" or "r+b". You specify grow=no to inform the library that the program will only replace existing records of a file, rather than adding any data to the end. When you specify grow=no for a PDSE member, the library can open the member for UPDAT rather than OUTIN and can then support use of either the fseek or fsetpos function.

The grow amparm is also supported for standard PDS members, and it should be used where possible, because it performs an update-in-place action, and avoids wasting the space in the PDS occupied by the previous member.

Allocating PDSEs

When a new partitioned data set is created, the decision to create it as a regular PDS or as a PDSE is normally determined by your site, possibly based on data set name or other data set characteristics. In some cases, you may want to force a particular choice. The org amparm supports this. org has more uses than just PDS allocation. See Opening Files for more information.

When you use the afopen function to create a new PDS, you can specify one of three values for org:

po: specifies that the file is a PDS and that normal site criteria should be used to select between a regular PDS and a PDSE.
pds: specifies that the file should be created as a regular PDS.
pdse: specifies that the file should be created as a PDSE.

Note: A site may choose to ignore a program's request for a particular type of PDS, although this is fairly unusual. For this reason, it cannot be guaranteed that org=pds or org=pdse will be honored in all cases. If your operating-system level does not support PDSEs, the org values pds and pdse will be treated like the value po.

Using VSAM linear data sets (DIV objects)

A DIV object is different from other MVS-type data sets. Essentially, it is a single stream of data with no record or block boundaries. The operating system processes the file in 4096-byte units with paging I/O, mapping the data in the file to virtual storage referred to by the program (all of which is transparent to the program). For more information on DIV objects, see the IBM manual MVS/ESA Application Development Guide.

You can access DIV objects using the ordinary C library I/O functions and the "rel" access method. Two amparms are available for use with VSAM linear data sets. These amparms are not required, but they allow the program to direct the internal buffering algorithm used by the library:

bufsize=nnn: specifies the size, in bytes, of a DIV window.
bufmax=n: specifies the number of DIV windows.

The value specified for bufsize is rounded up to a multiple of 4096. The default value for bufsize is bufsize=262144 (256K). The default value for bufmax is bufmax=4. These default values can be modified by your site; see your SAS software representative for C compiler products for more information about the default values for bufsize and bufmax. This discussion assumes the default values have not been modified.
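The rounding rule can be written out as a short calculation. The function name div_round_bufsize is hypothetical; it shows only the arithmetic of rounding up to a multiple of 4096, not library code.

```c
#include <assert.h>

#define DIV_PAGE 4096UL

/* Round a requested bufsize up to the next multiple of 4096 bytes. */
unsigned long div_round_bufsize(unsigned long requested)
{
    /* Guard against overflow in the addition below. */
    assert(requested <= (unsigned long)-1 - (DIV_PAGE - 1));
    return (requested + DIV_PAGE - 1) / DIV_PAGE * DIV_PAGE;
}
```

For example, a request of 100000 bytes becomes 102400 (25 units of 4096), while the default of 262144 is already a multiple of 4096 and is unchanged.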

DIV windows

The library allocates one window when the object is opened. This window is mapped to the beginning of the object. When a reference is made to a location that is outside the bounds of the window, the library allocates a new window that maps the location.

New windows can be allocated, until the number specified by bufmax is reached. Then, if a reference is made to a location that is not mapped by any window, the library remaps the least-used window to the new location. The least-used window is the window that has the fewest references made to locations that it maps.

If the limit specified by bufmax has not been reached, but there is insufficient storage available to allocate a new window, the library issues a warning and begins remapping existing windows.
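The window policy described above can be modeled in a few lines. This is a simplified sketch, not the library's algorithm: the structure and function names are hypothetical, windows are assumed to be aligned on multiples of bufsize, and a remapped window is assumed to restart its reference count at 1.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_WINDOWS 16

struct div_window {
    unsigned long base;    /* object offset mapped by the window's first byte */
    unsigned long refs;    /* references made to locations in this window */
};

struct div_map {
    struct div_window win[MAX_WINDOWS];
    size_t count;          /* windows allocated so far */
    size_t bufmax;         /* limit on the number of windows */
    unsigned long bufsize; /* window size in bytes */
};

/* Opening the object allocates one window mapped to its beginning. */
void div_map_open(struct div_map *m, unsigned long bufsize, size_t bufmax)
{
    assert(m != NULL && bufsize > 0 && bufmax >= 1);
    m->bufsize = bufsize;
    m->bufmax = bufmax > MAX_WINDOWS ? MAX_WINDOWS : bufmax;
    m->win[0].base = 0;
    m->win[0].refs = 0;
    m->count = 1;
}

/* Record a reference to offset and return the index of the window
   that maps it, allocating or remapping a window when necessary. */
size_t div_map_ref(struct div_map *m, unsigned long offset)
{
    unsigned long base = offset - offset % m->bufsize;
    size_t i, victim = 0;

    for (i = 0; i < m->count; i++) {
        if (m->win[i].base == base) {
            m->win[i].refs++;
            return i;
        }
    }
    if (m->count < m->bufmax) {
        victim = m->count++;               /* allocate a new window */
    } else {
        for (i = 1; i < m->count; i++)     /* remap the least-used window */
            if (m->win[i].refs < m->win[victim].refs)
                victim = i;
    }
    m->win[victim].base = base;
    m->win[victim].refs = 1;
    return victim;
}
```

With bufmax=2, references to two distant locations fill both windows, and a reference to a third location remaps whichever window has recorded fewer references.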

How the amparms are used

As with other amparms, the linear data set amparms may be specified in the amparms argument to afopen or aopen. If one of the amparms is omitted, then the library uses its default value.

If a linear data set is opened with fopen or open, or neither amparm is used, then bufsize is calculated from the object size divided by 4, rounded to a multiple of 4096 as necessary. If the data set has size 0 (that is, the data set is empty), the default values are used. If there is insufficient storage available to allocate the first window, the library issues a warning and uses whatever storage is available.

Advanced CMS I/O Facilities

This section discusses several advanced I/O tasks under CMS, such as use of xed style files, extending global MACLIB/TXTLIB processing, and using the CMS shared file system.

The xed filename style

The CMS version of the library supports access to files being processed by XEDIT with the xed filename style. xed style filenames have the same format as cms-style filenames, for example, xed:payroll data a. The filename must identify a CMS disk file. That is, you cannot specify device names such as PRINTER. Also, you cannot use the MEMBER keyword (or its abbreviated format equivalent).

You can use the xed style even when XEDIT is not active. In this case, or when the file requested is not in the XEDIT ring, the file is read from disk.

See the system function description and Chapter 2, "CMS Low-Level I/O Functions," in SAS/C Library Reference, Volume 2 for information on other facilities that may be useful for programs that use XEDIT files.

Extensions to global MACLIB/TXTLIB processing

As described previously, you can use the cms-style filenames %MACLIB (MEMBER name) and %TXTLIB (MEMBER name) to access members of global MACLIBs or TXTLIBs. Global MACLIBs and TXTLIBs are established using the CMS GLOBAL command. Here is an example:

 GLOBAL TXTLIB LC370 MYLIB1 MYLIB2

When %TXTLIB(name) is opened, the libraries LC370 TXTLIB, MYLIB1 TXTLIB, and MYLIB2 TXTLIB are searched, in that order, for the member name. Also, the library implements several extensions to standard CMS GLOBAL processing to support larger numbers of global libraries than allowed directly by the CMS GLOBAL command. These extensions also support the use of OS partitioned data sets as global libraries.

One extension to GLOBAL processing enables you to issue a FILEDEF using the DDname CMSLIB and then include CMSLIB in the list of files for the GLOBAL command. This causes the files associated with the CMSLIB DDname to be treated as global. For example, if you issue the following commands, the same set of libraries as in the previous example is defined, and the effects of opening %TXTLIB(name) are the same:

 FILEDEF CMSLIB DISK MYLIB1 TXTLIB A
 FILEDEF CMSLIB DISK MYLIB2 TXTLIB A (CONCAT
 GLOBAL TXTLIB LC370 CMSLIB

One advantage of using the FILEDEF approach is that the FILEDEF may be concatenated, enabling you to bypass the limit of eight global libraries imposed by early versions of CMS. Another is that you can put OS partitioned data sets into the global list (in a non-XA system), as described in the following section. Note that when CMSLIB is concatenated, the global search order is the same as if CMSLIB were replaced in the global list by the files that compose it, in the order in which they were concatenated.

The special processing of the CMSLIB DDname is a feature of the SAS/C library; the DDname CMSLIB has no special significance to CMS.

Using an OS PDS as a global MACLIB/TXTLIB

If your site permits CMS access to OS disks, the CMSLIB FILEDEF for use in a global list may reference an OS partitioned data set. The FILEDEF must have the following form:

 FILEDEF CMSLIB DISK filename MACLIB fm DSN OS-data-set-name
 FILEDEF CMSLIB DISK filename TXTLIB fm DSN OS-data-set-name

The filemode (fm) cannot be an asterisk (*) and must refer to an OS disk. The PDS referenced must have fixed-length blocked records with an LRECL of 80. The PDS must reside on a 3330, 3350, or 3380 disk device.

Note: You cannot use an OS PDS as a global MACLIB/TXTLIB in an XA-mode virtual machine.

Using the CMS Shared File System

VM/SP Release 6 introduced a new file system into CMS called the Shared File System (SFS). This file system provides new file management and sharing capabilities. SFS files are stored in a file pool (a collection of CMS minidisks) where users are given space to organize files into directories. Directories enable users to group related files together. By granting read or write authority to files or directories, users can allow other users to share their files. This feature enables several users to have access to the same file at the same time, although only one user can update a shared file at a time.

When a shared file is open for update, the file system provides update access to a copy of the file. Changes to the file do not take effect until the changes are committed. Alternatively, after updating a file, the user can roll back the changes, which leaves the file unmodified. If a user opens a shared file for reading while another user is updating it, the reading user accesses a temporary copy of the data and can read only the data in the file at the time it was opened, even after the writing user commits changes.

Shared files can be accessed as if they were normal CMS disk files by using the CMS ACCESS command, which can assign a filemode letter to an SFS directory. Currently, use of unique SFS functionality, such as access to subdirectories and the ability to roll back changes, is not available with the CMS ACCESS command. These features are only available when the Shared File System is used directly.

The SAS/C Library allows access to the Shared File System directly and via the ACCESS command. If you use the ACCESS command to assign a filemode letter to an SFS directory, files in the directory can be accessed using standard CMS pathnames. Alternatively, a shared file can be processed directly by using an sf-style filename. For example, opening the following file accesses the file SUMMARY DATA in the directory of userid ACCTING named YR90.JUNE:

 sf:summary data accting.yr90.june

When a shared file is processed directly, it can be committed automatically as it is closed, or the file can be committed explicitly using the afflush function.

You can also process an SFS directory as if it were a file (for input only) by using an sfd-style filename. This enables you to retrieve various sorts of information about the files or subdirectories stored in the directory. The way in which information is returned is controlled by the dirsearch amparm.

SFS files can be processed with either the "seq" or "rel" access method, if the file attributes are appropriate. Except for trunc=yes (which is not allowed), all amparms that can be used with cms-style files can be used with an sf-style file. SFS directories can be opened only for input, and are always processed by the "seq" access method.

For more general information about using the Shared File System, see the IBM publication VM/ESA CMS User's Guide and other CMS documentation.

Naming shared files

The format of the name of a shared file, as specified to fopen, is

 sf:fileid dirid [filemode-number]

(You can omit the sf: prefix if the external variable _style has been set to define sf as the default style.)

Here fileid represents either a standard filename and filetype, or a namedef, which is an indirect reference to a filename and filetype created by the CMS CREATE NAMEDEF command.

Similarly, a dirid takes one of the following two forms:

 [filepool:][userid].[subdir1[.subdir2]...]
 namedef

The filepool argument identifies a file pool; userid identifies a user; subdir1, subdir2, and so on name subdirectories of the user's top directory; and namedef is an indirect reference to a directory created by the CMS CREATE NAMEDEF command. Note that every dirid that is not a namedef contains at least one period. The simplest dirid is "." (which represents the current user's top directory in the default file pool).

Here are some examples of sf filenames and their interpretation:

sf: profile exec .: specifies the file PROFILE EXEC in the current user's top directory.
sf: updates amy.: specifies the file identified by the namedef updates in the top directory of user AMY.
sf: test data qadir 3: specifies the file named TEST DATA in the directory identified by the namedef qadir. The file has file mode 3; that is, it will be erased after it is read.
sf: graphix data altpool:.model.test: specifies the file named GRAPHIX DATA in the current user's subdirectory MODEL.TEST in the file pool named ALTPOOL.

Note: There is no compressed (blankless) form for sf filenames.
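The naming rules above can be sketched in portable C. The helpers below are hypothetical (build_sf_name and dirid_is_namedef are not part of the SAS/C library) and assume a C99 snprintf. The first assembles a name in the sf:fileid dirid [filemode-number] form; the second applies the rule that a dirid without a period must be a namedef.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns 1 if dirid is a namedef, that is, if it
   contains no period (every explicit dirid contains at least one). */
int dirid_is_namedef(const char *dirid)
{
    return strchr(dirid, '.') == NULL;
}

/* Hypothetical helper: formats "sf:fileid dirid [filemode-number]" into buf.
   Pass modenum < 0 to omit the filemode number.
   Returns 0 on success, -1 if the result does not fit in buf. */
int build_sf_name(char *buf, size_t size, const char *fileid,
                  const char *dirid, int modenum)
{
    int n;

    if (modenum >= 0)
        n = snprintf(buf, size, "sf:%s %s %d", fileid, dirid, modenum);
    else
        n = snprintf(buf, size, "sf:%s %s", fileid, dirid);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

The resulting string is what would be passed as the filename argument to fopen or afopen. Because there is no blankless form for sf filenames, the helper always separates the parts with single blanks.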

Committing changes

When you open an sf-style file for update, you control when changes are committed. Two methods are provided for this purpose: the afflush function and the commit amparm.

The afflush function is called to flush output buffers to disk with high reliability. For SFS files, a call to afflush causes a commit to take place, so that all changes to the file up to that point are permanently saved.

The commit amparm is used with sf-style files to specify whether changes will be committed when the file is closed. The default, commit=yes, specifies that when the file is closed, changes are committed. The alternative, commit=no, specifies that changes are not committed when the file is closed. When you open a file with commit=no, you must call afflush before closing the file if you want changes saved. On the other hand, if you want to roll back your changes, close the file without calling afflush, and no changes will be saved. You can call afflush as often as you want, with either commit=yes or commit=no; when you close a commit=no file, all changes since the last call to afflush are rolled back. See the afflush function description for an example of the use of commit=no.
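The interaction between afflush and the commit amparm can be illustrated with a small in-memory model. This is only a sketch of the semantics described above, not library code: struct sfs_model and the model_ functions are hypothetical, and no SFS file is involved.

```c
#include <string.h>

#define DATA_MAX 64

/* Hypothetical in-memory model of an sf-style file opened for update. */
struct sfs_model {
    char committed[DATA_MAX];  /* contents that are permanently saved      */
    char working[DATA_MAX];    /* private copy that receives the updates   */
    int  commit_on_close;      /* 1 models commit=yes, 0 models commit=no  */
};

/* Models opening the file: the working copy starts as the saved contents. */
void model_open(struct sfs_model *m, const char *initial, int commit_on_close)
{
    strncpy(m->committed, initial, DATA_MAX - 1);
    m->committed[DATA_MAX - 1] = '\0';
    strcpy(m->working, m->committed);
    m->commit_on_close = commit_on_close;
}

/* Models an update: only the working copy changes. */
void model_write(struct sfs_model *m, const char *data)
{
    strncpy(m->working, data, DATA_MAX - 1);
    m->working[DATA_MAX - 1] = '\0';
}

/* Models afflush: everything written so far is committed. */
void model_flush(struct sfs_model *m)
{
    strcpy(m->committed, m->working);
}

/* Models fclose: commit with commit=yes, roll back with commit=no. */
void model_close(struct sfs_model *m)
{
    if (m->commit_on_close)
        strcpy(m->committed, m->working);
    else
        strcpy(m->working, m->committed);  /* changes since last flush lost */
}
```

In this model, writing "v2", flushing, writing "v3", and then closing a commit=no file leaves "v2" as the saved contents, which matches the rule that only changes made since the last call to afflush are rolled back.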

Reading shared-file directories

To process a CMS Shared File System directory, you open an sfd-style pathname for input. The pathname specifies the directory to be read, and possibly a subset of the directory entries to be read. The format of the information read from the file, as well as which entries (files and subdirectories) are processed, is determined by the value of the dirsearch amparm when the file is opened.

The following values are accepted for dirsearch:

file: specifies that information is to be read for files in the directory. This option corresponds to the FILE option of the CMS DMSOPDIR routine.
all: specifies that information is to be read for files in the directory or its subdirectories. This option corresponds to the SEARCHALL option of the CMS DMSOPDIR routine.
allauth: specifies that information is to be read for files in the directory or its subdirectories to which the user is authorized. This option corresponds to the SEARCHAUTH option of the CMS DMSOPDIR routine.
subdir: specifies that information is to be read for subdirectories of the directory. This option corresponds to the DIR option of the CMS DMSOPDIR routine.

If you specify no value for dirsearch when you open a shared-file directory, dirsearch=allauth is assumed.

When you open a directory with dirsearch=file, dirsearch=all, or dirsearch=allauth, the pathname specifies both the directory that is to be read and a filename and filetype, possibly including wild-card characters, that select which directory entries are to be read.

The form of an sfd-style filename for these dirsearch values is

 sfd: fileid dirid [filemode-number]

If fileid has the form filename filetype, the filename, filetype, or both can be specified as *, indicating that the filename and/or filetype is not to be considered while reading the directory. If filemode-number is specified, only entries for files with the specified mode number are read.

Here are a few examples of sfd-style filenames for use with dirsearch=file, dirsearch=all, or dirsearch=allauth:

sfd: * * .: specifies that entries for all files in the user's top directory are to be read.
sfd: * exec devel: specifies that entries for all files with the filetype EXEC in the directory identified by the namedef devel are to be read.
sfd: * * mike.backup 2: specifies that entries for all files with filemode number 2 in the subdirectory BACKUP, belonging to the user MIKE, are to be read.

When you open a directory with dirsearch=subdir, the pathname specifies only the directory that is to be read. This form of sfd-style filename is also used when you call remove, rename, access, cmsstat, or sfsstat for a Shared File directory.

Here are a few examples of sfd-style filenames to use with dirsearch=subdir:

sfd: .: specifies that entries for all subdirectories of the user's top directory are to be read.
sfd: master: specifies that entries for all subdirectories of the directory associated with the namedef master are to be read.

After you open a Shared File System directory, you read it as any other file. The data you read consist of a number of records, one for each matching file if dirsearch=subdir is not specified, and one for each subdirectory if dirsearch=subdir is specified. Mappings for the formats of these records are provided in the header file <cmsstat.h>. The following are more exact specifications:

dirsearch=file: specifies that the records are mapped by a portion of struct MAPALL. Only the first L_file bytes of this record are actually read.
dirsearch=all or dirsearch=allauth: specifies that the records are mapped by struct MAPALL. The number of bytes read is L_all.
dirsearch=subdir: specifies that the records are mapped by struct MAPDIR. The number of bytes read is L_dir.

These structures generally contain binary data, so new-line characters can appear in the data. Therefore, you should open sfd files for binary rather than text access to avoid possible confusion. Also, none of the seeking functions fseek, ftell, fsetpos, or fgetpos is supported for Shared File directories.

Here is a simple example that opens a directory with dirsearch=file to print out the names of the files in the directory:

 #include <stdlib.h>
 #include <lcio.h>
 #include <cmsstat.h>

 struct MAPALL dir_data;

 int main(void) {
    FILE *dir;
    int count = 0;
       /* Open top directory for input. */
    dir = afopen("sfd: * * .", "rb", "", "dirsearch=file");

    if (!dir) abort();
    for(;;) {
       fread(&dir_data, L_file, 1, dir);  /* Read one entry. */
       if (ferror(dir)) abort();
       if (feof(dir)) break;
       if (count == 0)     /* Write title before first line. */
          puts("Files in directory .:");
       ++count;
       printf("%8.8s %8.8s\n", dir_data.file.name,
              dir_data.file.type);
    }

    printf("\n%d files found.\n", count);
    fclose(dir);
    exit(0);
 }

Using VSAM Files

VSAM (Virtual Storage Access Method) is a file organization and access method oriented toward large collections of related data. Ordinary MVS and CMS files are organized as a sequential collection of characters, possibly grouped into records. To locate a particular record, it is necessary to read the file from the beginning until the appropriate record is found. VSAM files are organized according to keys that serve as record identifiers. This makes VSAM especially useful and efficient for many large-scale data processing applications.

For example, suppose you have a file containing a record for each employee in a company. If the data are stored in a normal MVS or CMS file, to update a single record you must read the file from the beginning until the correct record is found. Alternatively, the data can be stored in a VSAM file, using employee name or employee number as a key. With this organization, a program can immediately read and update any record, given the key value, without having to read the rest of the file.

VSAM files are described in more detail in the IBM publication MVS/DFP Using Data Sets. Consult this publication for a more complete description of VSAM files and services.

Kinds of VSAM files

Four kinds of VSAM files exist: KSDS, ESDS, RRDS, and LDS files. Following are the characteristics of these files:

KSDS (Key-Sequenced Data Set): is a VSAM file in which each record has a character key stored at some offset in each record. Each record must have a unique key. Records are stored and retrieved in key sequence. They can be added to or deleted from the data set in any order; that is, you can freely add new records between existing records. When records are modified, the length of the record can change, but the value of the key cannot be changed. Most VSAM files are KSDS files.
KSDS files can have alternate indices, which are auxiliary VSAM files that provide access to records by a key field other than the primary key. Access to records by an alternate index is accomplished by a path, which is an artificial data set name assigned to the combination of the base KSDS and the alternate index. An alternate index can be defined for a KSDS using a nonunique key field. For example, an employee file cannot have last name as its primary key because more than one employee may have the same last name. But a path to the employee file can use last name as its key, allowing quick access, for instance, to the records for all employees named Brown.
ESDS (Entry-Sequenced Data Set): is a VSAM file in which each record is identified by its offset from the start of the file. This offset is called a relative byte address (RBA). When a record is added to the file, it is added to the end, and the RBA of the new record is returned to the program. The RBA thus serves as a logical key for the record. ESDS data sets can have records of different lengths, but when a record is replaced, the length must not be changed. Also, records cannot be deleted.
Like KSDS files, ESDS files can have alternate indices that provide keyed access to the records of the ESDS. For instance, you can choose to make an employee file an ESDS file, with the records arranged by the order in which they were added to the file. You can still access records in this file using last names by building an alternate index with last name as the key over the ESDS. The same rules apply when you use a path to access an ESDS as when you access the ESDS directly; that is, you cannot change the length of a record or delete a record. Except for these restrictions, a path over an ESDS is treated as a KSDS because records accessed through the path are always arranged in the order of the alternate keys.
RRDS (Relative-Record Data Set): is a VSAM file in which each record is identified by record number. Records can be added or deleted in any order, and the file can have holes; that is, it is not necessary that all possible record numbers be defined. Records in an RRDS file are normally all the same length; SAS/C does not support RRDS files with variable-length records. Alternate indices over an RRDS are not supported.
LDS (Linear Data Set): is a VSAM file consisting solely of a sequence of characters divided into 4096-byte pages. Unlike other VSAM files, LDS files are not keyed, and normally are accessed via the supervisor DIV (Data In Virtual) service, as described in the IBM publication MVS/ESA Application Development Guide. LDS files are supported only under MVS.

Access to VSAM files using standard I/O

You can access all kinds of VSAM files using the standard C I/O functions defined by <stdio.h>. Because of the special characteristics of VSAM files, there are restrictions for some types of VSAM files:

Access to a KSDS file via standard I/O must be for input only. Output cannot be supported, because records are written in key order, and the C standard specifies that new data can be written only to the end of a file. That is, it is not possible, using standard I/O, to insert characters between existing characters of an old file. When you read a KSDS file using standard I/O, the records are always retrieved in ascending key order.
Access to an ESDS file via standard I/O is fully supported, in both text and binary mode.
Access to an RRDS file via standard I/O is supported only in binary mode. An RRDS file is processed using the "rel" access method. This means that you can use the fseek and ftell functions, which provides compatibility with file behavior on UNIX operating systems.
Access to an LDS file via standard I/O is supported only in binary mode. The library uses the DIV macro to access the file rather than reading or writing it directly. An LDS file is processed using the "rel" access method. This means that the fseek and ftell functions can be used, which provides compatibility with file behavior on UNIX operating systems. See Advanced MVS I/O Facilities for more information on accessing LDS files.
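The restrictions in the preceding list can be summarized as a small decision function. This is an illustrative sketch only; the enum and the function stdio_access_ok are hypothetical names and are not part of the library.

```c
/* Hypothetical names for the four kinds of VSAM file. */
enum vsam_kind { VSAM_KSDS, VSAM_ESDS, VSAM_RRDS, VSAM_LDS };

/* Returns 1 if standard I/O access is permitted by the rules listed above.
   binary is nonzero for binary mode, zero for text mode;
   output is nonzero if the open mode allows writing. */
int stdio_access_ok(enum vsam_kind kind, int binary, int output)
{
    switch (kind) {
    case VSAM_KSDS:
        return !output;   /* input only */
    case VSAM_ESDS:
        return 1;         /* fully supported, text or binary */
    case VSAM_RRDS:
    case VSAM_LDS:
        return binary;    /* binary mode only ("rel" access method) */
    }
    return 0;
}
```

A program that must write to a KSDS therefore cannot use standard I/O and needs the keyed-access mode instead.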

Keyed access to VSAM files

Because standard C I/O assumes that files are simply sequences of characters stored at unchanging offsets in the file, standard C is not suited to exploiting VSAM capabilities. For this reason, the SAS/C library provides a keyed-access mode for VSAM files designed to take advantage of their unique properties. Keyed access is an alternative to text or binary access, specified by the open mode argument to the fopen or afopen function, as shown in the following statements:

 ftext = fopen("ddn:ESDS", "r+");   /* Open ESDS for text access.   */
 fbin  = fopen("ddn:ESDS", "r+b");  /* Open ESDS for binary access. */
 fkey  = fopen("ddn:ESDS", "r+k");  /* Open ESDS for keyed access.  */

Only a subset of the standard I/O functions (shown in the following list) is available for files opened for keyed access; that is, this list shows the functions that make sense for such files.

afflush: flush output buffers, synchronize file access
clearerr: reset previous error condition
clrerr: reset previous error condition
fattr: return file attributes
fclose: close file
feof: test for end of file
ferror: test for error
ffixed: test for fixed-length records
fnm: return filename
fterm: test whether file is the terminal
setbuf: change buffering (null operation)
setvbuf: change buffering (null operation).

The following keyed-access functions are supported only for VSAM files. These functions are described in more detail later in this section:

kdelete: delete the last record retrieved
kgetpos: return position of current record
kinsert: add a new record
kreplace: replace the last record retrieved
kretrv: retrieve a record
ksearch: locate a record
kseek: reposition keyed file
ktell: return RBA of current record.

Keyed access is not supported for VSAM LDS files because these files are not divided into records and have no keys.

Records and keys

Keyed-access functions for VSAM process one record at a time, rather than one character at a time. Most functions have arguments that are pointers to records or pointers to keys. Because C is not, in general, a record-oriented language, you need to be careful when defining data structures for use with VSAM.

Many VSAM files have fixed-length records, where all records have the same format. These files are easy to process in C, because the record can be represented simply as a structure, as shown in this simple example:

 struct {
    char name[20];
    char address[50];
    short age;
 }  employee;

 kretrv(&employee, NULL, 0, f);

This example reads records of a single type from a VSAM file. More complicated files may have records with different lengths or types; C unions can be helpful in processing such files:

 struct personal {
    char name[20];
    char rectype;     /* = P for personal */
    char address[50];
    short age;
 } ;
 struct job {
    char name[20];
    char rectype;          /* = J for job */
    long salary;
    short year_hired;
 } ;
 union {
    struct personal p_rec;
    struct job j_rec;
 }  employee;

 kretrv(&employee, NULL, 0, f);

The call to the kretrv function can read a record of either type; then the rectype field can be tested to determine which type of record was read. Here is an example showing the replacement of a record in a file with several record types:

 if(employee.p_rec.rectype == 'P')
    recsize = sizeof(struct personal);
 else recsize = sizeof(struct job);
 kreplace(&employee, recsize, 0, f);

If the length were specified as sizeof(employee) and the record were a job record, more data would be written than is defined in the record, and file space would be wasted.
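The size selection can be packaged as a helper so that every replacement uses the length of the record type actually held in the union. The following is a minimal sketch in portable C: the union tag employee_rec and the function employee_rec_size are hypothetical, and the sizes reported by sizeof include whatever padding the compiler in use inserts.

```c
#include <stddef.h>

struct personal {
    char name[20];
    char rectype;      /* 'P' for personal */
    char address[50];
    short age;
};

struct job {
    char name[20];
    char rectype;      /* 'J' for job */
    long salary;
    short year_hired;
};

union employee_rec {
    struct personal p_rec;
    struct job j_rec;
};

/* Hypothetical helper: returns the length to write for the record
   currently held in the union.  name and rectype form a common initial
   sequence of both structures, so rectype can be read through either
   member. */
size_t employee_rec_size(const union employee_rec *rec)
{
    return rec->p_rec.rectype == 'P' ? sizeof(struct personal)
                                     : sizeof(struct job);
}
```

The value returned would take the place of recsize in the replacement call shown earlier. For a job record it is smaller than sizeof(union employee_rec), which is the point of testing rectype first.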

Any characters can occur in a record. Neither new-line characters ('\n') nor null characters ('\0') have any significance.

For a KSDS file, a record key is always a fixed-length field of the record. Any characters can appear in a key, including the new-line character ('\n') and the null character ('\0'). Also, the key is not restricted to being a character type; for some files, the key might be a long, a double, or even a structure type.

When you have character keys, you should be sure to specify all characters of the key. For instance, consider the following call to the ksearch function, intended to retrieve the record whose key is Brown in the employee file described previously:

 ksearch("Brown", 0, K_exact, f);

This search will probably fail, because the key length for this file is 20 characters. The ksearch function looks for a record whose key is the characters "Brown", followed by a null character, followed by 14 random characters (whatever comes after the string "Brown" in memory), and probably will not find one.

Usually, strings in VSAM files are padded with blanks, so the following example shows the correct usage:

 char key[20];
 memcpyp(key, "Brown", 20, 5, ' ');  /* Copy key and blank pad. */
 ksearch(key, 0, K_exact, f);
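For code that must also compile where memcpyp is unavailable, the same blank padding can be done with the standard memcpy and memset functions. The helper below is a hypothetical stand-in, not a library function; it truncates a source string that is longer than the key field.

```c
#include <string.h>

/* Hypothetical portable stand-in for the blank-padding copy shown above.
   Copies src (a null-terminated string) into the dstlen-byte key field
   dst and fills the remainder with pad.  No null terminator is stored. */
void pad_key(char *dst, size_t dstlen, const char *src, char pad)
{
    size_t srclen = strlen(src);

    if (srclen > dstlen)
        srclen = dstlen;                 /* truncate an overlong key  */
    memcpy(dst, src, srclen);            /* significant characters    */
    memset(dst + srclen, pad, dstlen - srclen);   /* pad the rest     */
}
```

Calling pad_key(key, sizeof key, "Brown", ' ') fills all 20 bytes of the key, so no characters left over in memory take part in the comparison.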

ESDS and RRDS files do not have physically recorded keys. However, the RBA (for an ESDS) and the record number (for an RRDS) serve as logical keys for these files. The structures representing these records in a C program must include an unsigned int or unsigned long field at the start of the record to hold the key value. This key is not actually recorded in the file. In this example, record 46 is inserted into an RRDS:

 struct {
    unsigned long recno;
    char name[20];
    char address[60];
 }  rrds_rec;

 rrds_rec.recno = 46;
 kinsert(&rrds_rec, sizeof(rrds_rec), NULL, rrds);

The key (46) is specified in the first 4 bytes of the record. Note that the key is not actually stored in the file. The size of the C record is 84 characters, but the length of the record in the VSAM file is 80 characters because the key is not physically recorded.
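The arithmetic can be checked in portable C. This sketch assumes a 4-byte key, as in the description above, and therefore declares the key as uint32_t, because unsigned long is 8 bytes on many current platforms. The structure tag and the function rrds_stored_length are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

struct rrds_rec {
    uint32_t recno;      /* logical key: 4 bytes, not recorded in the file */
    char name[20];
    char address[60];
};

/* Length of the record as stored in the VSAM file: everything that
   follows the leading key field. */
size_t rrds_stored_length(void)
{
    return sizeof(struct rrds_rec) - offsetof(struct rrds_rec, name);
}
```

With this layout, sizeof(struct rrds_rec) is 84 and rrds_stored_length() returns 80, matching the figures given above.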

Rules for keyed access

The keyed-access functions are record-oriented rather than character-oriented. When a keyed file is used, it can be in one of the following states:

Immediately after the file is opened, there is no current record defined. This means that functions such as kdelete and kreplace, which affect the current record, are not allowed at this point. After successful use of kinsert, kreplace, kdelete, ksearch, or kseek, the file is also in this state.
After successful use of kretrv, the retrieved record becomes current. This means that the record can be updated or deleted and that its address can be obtained with the ktell or kgetpos functions. The current record can either be held for update or not held. The record is not held if the file is opened for input only or if K_noupdate was specified as an argument to the kretrv call. Otherwise, the record is held. Replacement or deletion of a record is allowed only if it is held for update. Additionally, some other VSAM processing is different, depending on whether the current record is held. For more information, see VSAM pitfalls.
After an error in processing, the file is in an error state. You need to call the clearerr function to continue use of the file. In many cases after an error, the file position is undefined, and you have to use ksearch or kseek to reposition the file before continuing.
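The states described above can be modeled as a small state machine. This is a sketch for illustration, not library code: all names are hypothetical, a disallowed operation is modeled as entering the error state, and clearerr is assumed to leave the file with no current record.

```c
/* Hypothetical model of the states of a file opened for keyed access. */
enum kstate { KS_NO_CURRENT, KS_CURRENT, KS_CURRENT_HELD, KS_ERROR };

/* Hypothetical codes for the operations discussed above. */
enum kop { OP_RETRV, OP_RETRV_NOUPDATE, OP_INSERT, OP_REPLACE,
           OP_DELETE, OP_SEARCH, OP_SEEK, OP_CLEARERR };

/* Returns the state after a successful operation, or KS_ERROR if the
   operation is not allowed in state s.  input_only is nonzero if the
   file was opened for input only. */
enum kstate next_state(enum kstate s, enum kop op, int input_only)
{
    if (s == KS_ERROR)                   /* only clearerr is useful here */
        return op == OP_CLEARERR ? KS_NO_CURRENT : KS_ERROR;

    switch (op) {
    case OP_RETRV:                       /* held unless input only       */
        return input_only ? KS_CURRENT : KS_CURRENT_HELD;
    case OP_RETRV_NOUPDATE:              /* current but not held         */
        return KS_CURRENT;
    case OP_REPLACE:
    case OP_DELETE:                      /* need a record held for update */
        return s == KS_CURRENT_HELD ? KS_NO_CURRENT : KS_ERROR;
    case OP_INSERT:
    case OP_SEARCH:
    case OP_SEEK:                        /* no current record afterward  */
        return KS_NO_CURRENT;
    case OP_CLEARERR:                    /* nothing to clear             */
        return s;
    }
    return KS_ERROR;
}
```

For example, a delete attempted immediately after opening the file yields KS_ERROR in this model, while a retrieval followed by a replacement returns the file to KS_NO_CURRENT.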

Unlike some other kinds of files, your program can open the same VSAM file more than once. The same file can be opened any number of times, either using the same filename, or using different names. A file can also be opened via several paths. The open modes do not need to be the same in this case. For example, one stream can access the file for input only, and another can access it to append. However, each opening of the file must specify keyed access; that is, standard I/O and keyed I/O to the same file cannot be mixed.

When several streams access the same VSAM file, only one of them can hold a particular record for update. If one stream attempts to retrieve a record held by another stream and K_noupdate is not specified, retrieval will fail. Because of the way that VSAM holds records for update, it is also possible for two streams accessing the same file to interfere with each other's processing of different records. See VSAM pitfalls for more information.

Using a KSDS

The following are considerations unique to processing a KSDS:

Records can be retrieved in either ascending or descending key order.
KSDS files support many kinds of searches. You can search for a record with a particular key, a record with a matching partial (or generic) key, or a record with a key greater than or equal to a particular value. You can search either forward or backward to optimize retrieval in the chosen direction. (A backward search for an inexact key finds the last record whose key is less than or equal to the specified value.) Backward searches are restricted in paths that allow duplicate keys. See the ksearch function description for further details.
Records can be added to a KSDS at any point in the file. After a record is added, the file is positioned immediately following the new record.
Records can be deleted from a KSDS. After a record is deleted, the file is positioned to the record following the deleted record. Records cannot be deleted from a path whose base is an ESDS.
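The difference between forward and backward inexact searches can be illustrated on a sorted array. The sketch below uses integer keys as a stand-in for the character keys of a KSDS, and the function names are hypothetical; it models only which record is selected, not the library interface.

```c
#include <stddef.h>

/* Forward inexact search: index of the first key greater than or equal
   to target, or -1 if there is none.  keys must be in ascending order. */
long search_forward(const int *keys, size_t n, int target)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (keys[i] >= target)
            return (long)i;
    return -1;
}

/* Backward inexact search: index of the last key less than or equal
   to target, or -1 if there is none.  keys must be in ascending order. */
long search_backward(const int *keys, size_t n, int target)
{
    size_t i;

    for (i = n; i > 0; i--)
        if (keys[i - 1] <= target)
            return (long)(i - 1);
    return -1;
}
```

With keys 10, 20, 30, and 40, a forward search for 25 selects the record with key 30, and a backward search for 25 selects the record with key 20. When the target key exists, both searches select the same record.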

Using an ESDS

The following are considerations unique to processing an ESDS:

ESDS files have no physical keys. A record's RBA is used as a key by the C library routines. The areas used for input or output of ESDS records must include 4 bytes at the start for the record's RBA, in addition to the data stored in the file.
Records can be retrieved in either ascending or descending RBA order.
Only exact searches are allowed; that is, you can locate a record with a specific RBA, but it is not possible to search for a record whose RBA is greater than or less than a specific value.
New records are always inserted at the end of file in an ESDS. The RBA of the new record is optionally stored by kinsert and is not generally predictable before the record is inserted.
Deletion of records from an ESDS is not supported. Similarly, it is not permitted to change the length of an existing record.

Using an RRDS

The following considerations are unique to processing an RRDS:

RRDS files have no physical keys. A record's record number is used as a key by the C library routines. The areas used for input or output of RRDS records must include 4 bytes at the start for the record number, besides the data stored in the file.
Records can be retrieved in either ascending or descending record number order. Records that are not defined are skipped; that is, if records 1 and 3 have been stored, but record 2 is not defined (or has been deleted), record 3 will be retrieved after record 1, and no error will occur due to the missing record.
An RRDS does not support partial or generic key searches. However, you can search for the first record whose number is greater than or less than a specific value.
New records can be inserted anywhere in an RRDS where a record does not exist already. After a record is inserted, the file is positioned to the record number following the insertion.
Records can be deleted from an RRDS. After a deletion, the file is positioned to the record number after the deleted record.
SAS/C does not support RRDS data sets that contain varying-length records.
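The way sequential retrieval skips undefined records can be modeled with an array of slots. This is an illustrative sketch only: the function rrds_next is hypothetical, and record numbers are assumed to start at 1.

```c
#include <stddef.h>

/* Hypothetical model of ascending retrieval from an RRDS.
   defined[i] is nonzero if record number i + 1 exists.
   Returns the number of the first defined record after recno,
   or 0 if no further record exists.  Pass recno = 0 to start
   from the beginning of the file. */
unsigned long rrds_next(const unsigned char *defined, size_t nslots,
                        unsigned long recno)
{
    unsigned long r;

    for (r = recno + 1; r <= nslots; r++)
        if (defined[r - 1])
            return r;                    /* holes are skipped silently */
    return 0;
}
```

If records 1 and 3 are defined and record 2 is not, rrds_next returns 3 after record 1, mirroring the behavior described above in which the missing record causes no error.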

Using an alternate path

The following are considerations unique to processing an alternate path. An alternate path to a KSDS or an ESDS is treated, in general, as a KSDS, except that some operations are forbidden for a path whose base is an ESDS:

Records can be retrieved in either ascending or descending key order. If the file is a path with duplicate keys, records with the same key are retrieved in the same order that they were added to the file, whether retrieval is forward or backward. When processing a path with duplicate keys, you cannot switch between backward and forward retrieval without performing a search.
In general, paths support the same kinds of searches as KSDS files. You can search for a record with a particular key, a record with a matching partial (or generic) key, or a record with a key greater than or equal to a particular value. You can search either forward or backward to optimize retrieval in the chosen direction. (A backward search for an inexact key finds the last record whose key is less than or equal to the specified value.) Backward searches are restricted in paths that allow duplicate keys. See the ksearch function description for further details.
Records can be added to a path at any point in the file. After a record is added, the file is positioned immediately following the new record. Even for a path that allows duplicate keys, you are not permitted to add a record with the same primary key as an existing record.
Records can be replaced via an alternate path. Neither the primary key nor the alternate key can be changed. If the base of the path is an ESDS, the record length cannot be changed.
Records can be deleted from a path whose base is a KSDS. After a record is deleted, the file is positioned to the record following the deleted record. Records cannot be deleted from a path whose base is an ESDS.
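The differences among the file types covered in the preceding sections can be collected into one table. The sketch below encodes only what those sections state; the type and function names are hypothetical, and the table is a summary aid rather than anything the library provides.

```c
#include <stddef.h>

/* Hypothetical codes for the file types discussed above. */
enum kfile { KF_KSDS, KF_ESDS, KF_RRDS, KF_PATH_KSDS, KF_PATH_ESDS,
             KF_COUNT };

struct kcaps {
    int can_delete;       /* records can be deleted                      */
    int can_resize;       /* replacement can change the record length    */
    int insert_anywhere;  /* insertion is not limited to the end of file */
    int inexact_search;   /* greater-than or less-than searches          */
    int generic_search;   /* partial (generic) key searches              */
};

static const struct kcaps caps_table[KF_COUNT] = {
    /* delete  resize  insert  inexact  generic */
    {  1,      1,      1,      1,       1 },   /* KSDS                   */
    {  0,      0,      0,      0,       0 },   /* ESDS                   */
    {  1,      0,      1,      1,       0 },   /* RRDS (fixed length)    */
    {  1,      1,      1,      1,       1 },   /* path over a KSDS       */
    {  0,      0,      1,      1,       1 },   /* path over an ESDS      */
};

/* Returns the capabilities for a file type, or NULL if kind is invalid. */
const struct kcaps *caps_for(enum kfile kind)
{
    if ((int)kind < 0 || kind >= KF_COUNT)
        return NULL;
    return &caps_table[kind];
}
```

A program that handles several kinds of VSAM file could consult such a table before attempting an operation, for example refusing a delete request when caps_for(KF_PATH_ESDS)->can_delete is zero.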

VSAM pitfalls

The VSAM access method is highly optimized for performance when processing large files. Sometimes this optimization can produce incorrect results for programs that process the same VSAM data set using more than one FILE or more than one path. This section describes some of these situations and suggests how to circumvent them. These pitfalls apply only to programs that access the same file in several ways. Simple VSAM programs that open each VSAM file only once are not affected.

Note: Sharing of VSAM files between users through the SHAREOPTIONS(3) or SHAREOPTIONS(4) Access Method Services (AMS) options is not supported by the SAS/C library. If you ignore this restriction, lost records, duplicate records, or file damage may occur.

When the same file is accessed through several paths, a problem can occur when VSAM attempts to avoid reading a record from disk because a copy exists in memory. If a request is made to read a record and the record is not to be updated, VSAM may return an obsolete copy of the record from memory to the program, rather than reading a current copy from disk. If this is a problem for your application, you can always retrieve records for update by not specifying K_noupdate, regardless of whether you intend to update them. This ensures that you get the most recent version of a record at the cost of additional disk I/O. But note that you cannot retrieve a record for update if you open the file with open mode "rk". Alternatively, you can use the afflush function to flush all buffers in memory. After using afflush, retrievals access the most recent data from disk, because there is no copy in memory.

Many programs cannot be affected by this problem. For example, if your program processes records in key order, it probably does not ever attempt to retrieve a record after it has been updated. For such a program, there is no need to avoid the use of K_noupdate on retrieval.

Another potential problem has to do with the way that VSAM stores records. Records in VSAM files are organized into control intervals of a fixed size. Each disk access consists of the reading or writing of an entire control interval. When VSAM is said to be holding a record, the control interval is actually what is held. This means that an attempt to read a record for update using one stream may fail, because another record in the same control interval is held by another stream. In general, when this may occur cannot be predicted, although records whose keys are close to each other are more prone to this condition.
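To see why records with nearby keys are more likely to collide, consider which control interval contains a given relative byte address. The sketch below is a simplified model with hypothetical function names; the control interval size is whatever was chosen when the file was defined and is passed in by the caller.

```c
/* Index of the control interval that contains the given relative byte
   address, for control intervals of cisize bytes. */
unsigned long ci_index(unsigned long rba, unsigned long cisize)
{
    return rba / cisize;
}

/* Returns 1 if two records fall in the same control interval, so that
   holding one for update can block retrieval of the other for update. */
int may_conflict(unsigned long rba_a, unsigned long rba_b,
                 unsigned long cisize)
{
    return ci_index(rba_a, cisize) == ci_index(rba_b, cisize);
}
```

With 4096-byte control intervals, records at addresses 100 and 4000 share a control interval and can interfere with each other, while records at 4000 and 4100 do not. A keyed program usually does not know these addresses in advance, which is why the condition is hard to predict.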

This condition can be recognized by the setting of the value EINUSE for the errno variable. Resolving the condition is more difficult. It is possible to release a hold on a record without updating the record by the call kseek(fp, SEEK_CUR). This call does not change the file position, which means that kretrv can be used to retrieve the next record, as if kseek had never been called. Also, sometimes you may be able to organize your program so that it normally retrieves records using K_noupdate and, if the program decides that the record should be modified, retrieves the record a second time for update immediately before replacing or deleting it.

VSAM-related amparms

For some VSAM files, processing performance can be improved by allocating storage for additional I/O buffers. SAS/C allows you to specify buffer allocation for VSAM files using the following amparms. (Note that the order amparm can also have an effect on performance.)

The bufnd amparm specifies the number of I/O buffers VSAM is to use for the data records. This option is meaningful only for VSAM files and is equivalent to coding a BUFND value on an ACB assembler macro used to open the VSAM file. A data buffer is the size of a control interval in the data component of a VSAM cluster. For keyed access and order=random, the bufnd default is 2. For standard I/O VSAM access or keyed access with order=seq, the bufnd default is 4. Generally, with sequential access, the optimum value for the data buffers is six tracks, or the size of the control area, whichever is less. For skip-sequential processing, specifying two tracks for the data buffers is a good starting point. Specification of a bufnd value larger than the default generally yields performance improvements for applications that primarily do sequential input processing of the VSAM file or initial loading (sequential writes after first open) for a VSAM file opened with keyed access. In other situations, specifying a larger bufnd value may yield no improvement, or may actually degrade performance by tying up large amounts of virtual storage and causing excessive paging.
The bufni amparm specifies the number of I/O buffers VSAM is to use for index records. This option is meaningful only for VSAM KSDS files and is equivalent to coding a BUFNI value on an ACB assembler macro used to open the VSAM file. An index buffer is the size of a control interval in the index component of a keyed VSAM cluster. For keyed access (random or skip sequential), bufni defaults to 4, and for text or binary access (mainly sequential), it defaults to 1. For keyed access other than initial load, the optimum bufni specification is the number of high-level (nonsequence set) index buffers + 1. You can determine this number by subtracting the number of data control areas from the total number of index control intervals within the data set. You can use an upper-bound bufni specification of 32, which accommodates most VSAM files with reasonable index control interval and data control area sizes (cylinder allocated data component) up to the four-gigabyte maximum data component size allowed. Large bufni specifications incur little or no performance penalty, unless they are excessive.
The bufsp amparm specifies the maximum number of bytes of storage to be used by VSAM for all I/O buffers. This option is meaningful only for VSAM files and is equivalent to coding a BUFSP value on an ACB assembler macro used to open the file. A data or index buffer is the size of a control interval in the data or index component. For a valid bufsp specification (minimum of one index and two data buffers), VSAM allocates data and index buffers as follows:
- For order=seq amparm, initial keyed access load, or text or binary access, one or two index buffers are allocated, and the remaining bytes are allocated to data buffers.
- For keyed access and order=random, two data buffers are allocated, and the remaining bytes are used for index buffers.

A valid bufsp specification generally overrides any bufnd or bufni specification. However, the VSAM rules for doing this are fairly complex, and you should consult the appropriate IBM VSAM Macro Reference manual for your system for more information on the ACB macro BUFSP option.
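
The bufni rule of thumb above can be sketched in standard C (C99 for snprintf). This is only an illustration: the helper names suggest_bufni and format_vsam_amparms are hypothetical, the two input counts are assumed to come from an IDCAMS LISTCAT listing, and the bufnd= and bufni= keyword spelling is assumed to follow the same "keyword=value" style as the keylen and keyoff amparms.

```c
#include <stdio.h>

/* Hypothetical helper: suggested bufni for keyed access, following the
   rule of thumb "high-level index buffers + 1", where the high-level
   count is (total index control intervals - data control areas).
   The result is clamped to the range 1..32. */
int suggest_bufni(int total_index_cis, int data_control_areas)
{
   int high_level = total_index_cis - data_control_areas;
   int bufni;

   if (high_level < 0)
      high_level = 0;          /* inconsistent input: fall back to 1 */
   bufni = high_level + 1;
   if (bufni > 32)
      bufni = 32;              /* upper bound suggested in the text  */
   return bufni;
}

/* Hypothetical helper: build an amparm string such as
   "keylen=19, keyoff=0, bufnd=8, bufni=3".  Returns what snprintf
   returns: the length needed, or a negative value on error. */
int format_vsam_amparms(char *out, size_t outlen,
                        int keylen, int keyoff, int bufnd, int bufni)
{
   return snprintf(out, outlen,
                   "keylen=%d, keyoff=%d, bufnd=%d, bufni=%d",
                   keylen, keyoff, bufnd, bufni);
}
```

For instance, a cluster with 12 index control intervals and 10 data control areas yields a suggested bufni of 3. The formatted string would then be supplied as the amparms argument of afopen in place of a literal such as "keylen=19, keyoff=0".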

VSAM I/O Example

This example consists of three pieces of code. The first piece is the SAS/C VSAM example program; the second piece is the JCL used to create, load, and update the VSAM file; and the third piece is the JCL used to compile and link the VSAM example.

The SAS/C VSAM example program, KSDS, demonstrates how to load, update, search, retrieve, and delete records from a KSDS VSAM file. The VSAM file and the input for its initial load are:

ddn:ITEM: the VSAM file being used
ddn:DATA: where records for initially loading the VSAM file are stored

Two data files are used for the update:

ddn:UPDATE: contains the records to be added or updated
ddn:DELETE: contains the keys for the records being deleted

  #include <stdio.h>
  #include <stdlib.h>          /* exit                               */
  #include <string.h>          /* memcpy                             */
  #include <lcio.h>
  #include <fcntl.h>

  #define BUFSIZE 80
  #define VBUFSIZE 50
  #define KEYSIZE 19

  void loadit(void);           /* Load a VSAM file.                  */
  void update(void);           /* Update a VSAM file.                */
  void printfil(void);         /* Print a VSAM file.                 */
  void add_rep(void);          /* Add or update specific records.    */
  void del_rec(void);          /* Delete specific records.           */

  FILE *vfptr1,                /* ptr to the VSAM file               */
       *vfptr2,                /* another ptr to the VSAM file       */
       *fptr;                  /* ptr to the DATA file               */

  char buffer[BUFSIZE+1];      /* buffer for reading input file data */
  char vbuffer[VBUFSIZE+1];    /* VSAM record buffer                 */
  char key[KEYSIZE+1];         /* key field for VSAM record          */

  main()
  {
     unsigned long offset;
        /* If VSAM file has been loaded, update.  Otherwise LOAD.    */
     puts("Has the VSAM file been loaded?");
     quiet(1);
        /* Attempt to open the file r+k.  If that works, then it     */
        /*  is loaded.  The open could fail for reasons other than   */
        /*  that the file has not yet been loaded.  In this case,    */
        /*  loadit will also fail, and a library diagnostic will     */
        /*  be printed.                                              */
     vfptr1 = afopen("ddn:ITEM", "r+k", "", "keylen=19, keyoff=0");
     quiet(0);

     if (!vfptr1){
        puts("File has not been loaded. Load it.");
        loadit();
        }
     else{
        puts("File has been loaded. Update it.");
        update();
        }

        /* Show the current state of the VSAM file. */
     printfil();

        /* Est. 2nd ptr to VSAM file.  */
        /*Search and print specific records. */
     if ((vfptr2 = afopen("ddn:ITEM", "r+k", "", "keylen=19, keyoff=0"))
        == NULL){
        puts("Could not open VSAM file a 2nd time");
        exit(99);
        }
        /* Search for some specific records by key. */
     puts("\n\nDo some searching");
     memcpy(key, "CHEMICAL FUEL      ", KEYSIZE);
     key[KEYSIZE]='\0';      /* Terminate the key. */
     printf("Search for %s\n", key);
     ksearch(key, 0, K_noupdate | K_exact, vfptr1);
     memcpy(key, "HOUSEHOLD PAN      ", KEYSIZE);
     key[KEYSIZE]='\0';      /* Terminate the key. */
     printf("Now Search for %s\n", key);
     ksearch(key, 0, K_noupdate | K_exact, vfptr2);

        /* Retrieve the records found. */
     puts("\n\nOK, now retrieve the records that we found");
     kretrv(vbuffer, NULL, K_noupdate, vfptr1);
     vbuffer[VBUFSIZE]='\0';
     puts(vbuffer);
     kretrv(vbuffer, NULL, K_noupdate, vfptr2);
     vbuffer[VBUFSIZE]='\0';
     puts(vbuffer);

     fclose(vfptr2);

        /* Find the first and last records in the file and their RBA. */
     kseek(vfptr1, SEEK_SET);
     kretrv(vbuffer, NULL, K_noupdate, vfptr1);
     vbuffer[VBUFSIZE]='\0';
     printf("\nThe first record in the file is:\n%s\n", vbuffer);
     offset = ktell(vfptr1);
     printf("Its RBA is: %lu\n", offset);
     kseek(vfptr1, SEEK_END);
     kretrv(vbuffer, NULL, K_backwards | K_noupdate, vfptr1);
     vbuffer[VBUFSIZE]='\0';
     printf("\nThe last record in the file is:\n%s\n", vbuffer);
     offset = ktell(vfptr1);
     printf("Its RBA is: %lu\n", offset);
  }

     /* This is the loadit function, which does the initial */
     /*  loading of the VSAM file.                          */
  void loadit()
  {

     puts("Loading the VSAM file");
     if ((fptr = fopen("ddn:DATA", "rb")) == NULL){
        puts("Input file could not be opened");
        return;
     }

        /* We must attempt to open the file again.  Since we are here, */
        /*  we know that the first attempt failed, probably because    */
        /*  the file was empty.                                        */

     if ((vfptr1 = afopen("ddn:ITEM", "a+k", "", "keylen=19, keyoff=0"))
        == NULL){
        puts("VSAM file could not be opened");
        return;
     }

     while (afread(buffer, 1, BUFSIZE, fptr)){
        kinsert(buffer, VBUFSIZE, NULL, vfptr1);
     }
  }


  /* The following function updates the VSAM file by calling functions */
  /*  to add, replace, and delete records.                             */

  void update()
  {
     puts("Updating the VSAM file");
     printfil();
     add_rep();
     del_rec();
  }


     /* The add_rep function updates the VSAM file by adding or  */
     /*  replacing records specified in DDN:UPDATE.              */

  void add_rep()
  {

     puts("\nUpdating specified VSAM records");
     if ((fptr = fopen("ddn:UPDATE", "rb")) == NULL){
        puts("Update file could not be opened");
        return;
     }
     puts("\n");

        /* Search VSAM file for records whose keys match those in */
        /*  UPDATE file. If a match is found, update record.      */
        /*  Otherwise, add record.                                */

     kseek(vfptr1, SEEK_SET);
     while (afread(buffer, 1, BUFSIZE, fptr)){
        memcpy(key, buffer, KEYSIZE);
        if ((ksearch(key, 0, K_exact, vfptr1)) > 0){
          kretrv(vbuffer, NULL, 0, vfptr1);
          vbuffer[VBUFSIZE]='\0';
          printf("Replace the record:\n%s\n", vbuffer);
          memcpy(vbuffer, buffer, VBUFSIZE);
          vbuffer[VBUFSIZE]='\0';
          printf("With:\n%s\n", vbuffer);
          kreplace(vbuffer, VBUFSIZE, vfptr1);
        }
        else{
          memcpy(vbuffer, buffer, VBUFSIZE);
          vbuffer[VBUFSIZE]='\0';
          printf("Can't find this record, so we'll add it:\n%s\n",vbuffer);
          kinsert(vbuffer, VBUFSIZE, NULL, vfptr1);
        }
     }
     fclose(fptr);
  }

     /* The del_rec function deletes VSAM records               */
     /* with specified keys.                                    */

  void del_rec()
  {
     puts("\nDeleting specified VSAM records");
     if ((fptr = fopen("ddn:DELETE", "rb")) == NULL){
        puts("Delete file could not be opened");
        return;
     }

        /* Search VSAM file for records whose keys match those in */
        /*  DELETE file. If a match is found, delete record.      */
        /*  Otherwise, issue a message that no match was found.   */

     kseek(vfptr1, SEEK_SET);
     while (afread(buffer, 1, BUFSIZE, fptr)){
        memcpy(key, buffer, KEYSIZE);
        key[KEYSIZE]='\0';
        if ((ksearch(key, 0, K_exact, vfptr1)) > 0){
          kretrv(vbuffer, NULL, 0, vfptr1);
          vbuffer[VBUFSIZE]='\0';
          printf("Delete the record:\n%s\n", vbuffer);
          kdelete(NULL, vfptr1);
        }
        else
          printf("Couldn't find a record with the key: %s\n", key);
     }
     fclose(fptr);
  }


     /* The function printfil prints the contents of the VSAM file. */
  void printfil()
  {
     int i=0;

     puts("\n\nHere is the current state of the VSAM file");
     puts("   ITEM             QTY SZ BIN DESC    COMMENTS  ");
     kseek(vfptr1, SEEK_SET);
     while ((kretrv(buffer, NULL, K_noupdate, vfptr1)!=-1) &&
           !feof(vfptr1)){
        buffer[VBUFSIZE] = '\0';
        printf("%d %s\n", i, buffer);
        i++;
     }
  }

The JCL file KSDSGO creates a KSDS VSAM file, loads it using the above program (KSDS), and then updates it using KSDS.

Note: The DEFINEIT step will fail the first time if there is no VSAM file to delete. You must either comment out the DELETE command the first time or create a dummy VSAM file that can be deleted before this job is run:

  //*------------------------------------------------------------------
  //* DEFINE THE VSAM KSDS
  //*------------------------------------------------------------------
  //DEFINEIT EXEC PGM=IDCAMS
  //SYSPRINT DD SYSOUT=A
  //SYSIN DD *
   DELETE (yourid.ksds.vsamfile) PURGE CLUSTER
   DEFINE CLUSTER (NAME(yourid.ksds.vsamfile) INDEXED VOLUMES(YOURVOL) -
                TRACKS(2 2) KEYS(19 0) RECORDSIZE(50 100)             -
                FREESPACE(0 0) CISZ(512) )                            -
         DATA  (NAME(yourid.ksds.vsamfile.DATA))                      -
         INDEX (NAME(yourid.ksds.vsamfile.INDEX))
   LISTCAT ENTRIES(yourid.ksds.vsamfile) ALL
  /*
  //*------------------------------------------------------------------
  //* LOAD THE VSAM KSDS USING SAS/C
  //*------------------------------------------------------------------
  //LOADIT EXEC PGM=KSDS
  //STEPLIB DD DSN=your.load.dataset,DISP=SHR
  //        DD DSN=sasc.transient.library,DISP=SHR
  //SYSPRINT DD SYSOUT=A
  //SYSTERM  DD SYSOUT=A
  //SYSUDUMP DD SYSOUT=A
  //ITEM     DD DSN=yourid.ksds.vsamfile,DISP=SHR
  //DATA     DD  *
  AUTO STARTER       99 02 92  WARRANTY  MODERATE
  CHEMICAL ACID      05 04 00  PH10      MODERATE
  CHEMICAL EAAAAA    55 75 50  ALCOHOL   CHEAP
  CHEMICAL FUEL      45 77 80  DIESEL    EXPENSIVE
  CHEMICAL GAS       10 30 50  LEADED    CHEAP
  HOUSEHOLD PAN      03 10 33  METAL     CHEAP
  /*
  //*
  //*------------------------------------------------------------------
  //* OBTAIN AN IDCAMS LISTCAT
  //*------------------------------------------------------------------
  //IDCAMS3  EXEC PGM=IDCAMS
  //SYSPRINT DD SYSOUT=A
  //SYSIN DD *
    LISTCAT ENTRIES(yourid.ksds.vsamfile) ALL
  /*
  //*
  //*------------------------------------------------------------------
  //* ADD/UPDATE/DELETE RECORDS TO THE VSAM KSDS
  //*------------------------------------------------------------------
  //UPDATE EXEC PGM=KSDS
  //STEPLIB DD DSN=your.load.dataset,DISP=SHR
  //        DD DSN=sasc.transient.library,DISP=SHR
  //SYSPRINT DD SYSOUT=A
  //SYSTERM  DD SYSOUT=A
  //SYSUDUMP DD SYSOUT=A
  //ITEM     DD DSN=yourid.ksds.vsamfile,DISP=SHR
  //UPDATE   DD  *           LIST OF RECORDS TO BE ADDED OR UPDATED
  CHEMICAL FUEL      45 77 80  GASOLINE  CHEAP
  CHEMICAL FUELS     55 67 81  PROPANE   CHEAP
  /*
  //DELETE   DD  *           LIST OF KEYS OF RECORDS TO BE DELETED
  AUTO STARTER
  BOGUS ID
  /*
  //

The JCL file called KSDSCL compiles and links KSDS.

  //*------------------------------------------------------------------
  //* COMPILE AND LINK THE PROGRAM
  //*------------------------------------------------------------------
  //CL EXEC LC370CL
  //C.SYSLIN DD DSN=your.obj.dataset(KSDS),DISP=OLD
  //C.SYSIN  DD DSN=your.source.dataset(KSDS),DISP=SHR
  //LKED.SYSLIN DD DSN=your.obj.dataset(KSDS),DISP=OLD
  //LKED.SYSLMOD DD DSN=your.load.dataset(KSDS),DISP=SHR
  //

SAS/C I/O Questions and Answers

The following are frequently asked questions about SAS/C I/O:

Flushing output to disk

Q. My program runs for days at a time. I want to be sure that all data I write to my files are actually stored on disk in case the system fails while the program is running. fflush does not seem to guarantee this. What can I do?

A. It is true that using fflush on a file does not guarantee that output data are immediately written to disk. For a file accessed as a binary stream, fflush passes the current output buffer to the system I/O routines, but there is no guarantee that the data are immediately transmitted. (For instance, under MVS the data are not written until a complete block of data has been accumulated.) This situation is not limited to MVS and CMS. For instance, fflush under most versions of UNIX simply transfers the data to a system data buffer, and there is no guarantee that the data are immediately written to disk.

Even after output data are physically written to disk, they may be inaccessible after a system failure. This is due to the details of the way that disk space is managed under MVS and CMS. For instance, under CMS, the master file directory for a minidisk maps the blocks of data associated with each file. If a program adds new data blocks to a file, but the system fails before the master file directory is updated, the new blocks are inaccessible. CMS only updates the master file directory when a command terminates, or when the last open file on the disk is closed by CMS. A similar problem occurs under MVS with the end-of-file pointer in the data set label, which is updated only when the file is closed.

The recommended solution to this problem is the nonportable SAS/C function afflush. This function is similar to fflush but guarantees that all data has been written to disk and that the master file directory or VTOC has been updated to contain current information. For programs using UNIX style I/O, the function fsync can be used in the same way.
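
For the UNIX style I/O case, the idea can be sketched as follows. This is a minimal illustration written against POSIX headers (header locations under SAS/C may differ), and the helper name append_durably is hypothetical. A short write is simply treated as a failure here.

```c
#include <sys/types.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: append one record to a file using UNIX style
   I/O and force it to disk with fsync before returning.
   Returns 0 on success, -1 on any failure. */
int append_durably(const char *path, const char *data, size_t len)
{
   int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);

   if (fd < 0)
      return -1;

   /* write hands the data to the system; fsync asks the system to  */
   /* commit it to disk before the call returns.                    */
   if (write(fd, data, len) != (ssize_t) len || fsync(fd) != 0) {
      close(fd);
      return -1;
   }
   return close(fd);
}
```

A long-running program would more likely keep the file open and call fsync at checkpoints rather than reopening the file for each record; opening and closing here only keeps the sketch self-contained.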

Comparing C standard I/O to other languages' I/O

Q. I have compared a small C program with a small COBOL program. Both programs simply copy records from one file to another. Why is C standard I/O so much slower than COBOL, and is there anything I can do about it?

A. A simple COBOL file copy and a simple C file copy are not comparable. COBOL I/O can be implemented much more efficiently on the 370 than C standard I/O because the COBOL I/O model is a record I/O model, corresponding closely to the I/O model implemented by the operating system. COBOL clauses such as "RECORD CONTAINS 80 CHARACTERS" and "BLOCK CONTAINS 40 RECORDS" allow the compiler to generate code that invokes the operating system's access methods directly.

C standard I/O, on the other hand, is stream-oriented. The C library cannot call the access method directly because of the need to support text file translation and complicated interfaces like ungetc, fflush, and fseek. Even if the program does not use these capabilities, the library must still support them. In addition, because of the requirement that compiled code be system independent, and because the meanings of attributes like reclen and blksize differ from MVS to CMS, C I/O cannot be optimized based on knowledge of these attributes, as COBOL can.

The SAS/C OS and CMS low-level I/O functions enable you to access files at the same low level used by COBOL. When you use low-level I/O in C, you will find performance similar to that of COBOL. If you do not want to use low-level I/O, then refer to the next question in this section.

Note: The discussion here applies when you compare C I/O to other languages such as PL/I. However, the difference is not as great because these languages also compare unfavorably to COBOL, and for the same reason: their I/O models are not as close to the 370 model as COBOL's, although they are closer than the C model.

Efficient I/O

Q. What can I do to improve the performance of I/O?

A. Here are some recommendations for improving the performance of I/O. Each recommendation should be evaluated individually because many are not relevant to every application:

Avoid UNIX style I/O if possible, except when using the OpenEdition hierarchical file system. If you need UNIX I/O properties, such as byte-addressability, use files suitable for "rel" access to avoid the overhead of copying the data to a temporary file.
Use binary rather than text access, if you have a choice. When you read or write a file as a text stream, every character read or written must be inspected to see if it is a new-line character and, therefore, requires special treatment. No such tests are necessary with binary access. If your application uses fgets or fputs to process data one line at a time and it is not required to be portable, investigate whether it could be changed to use afread or afwrite instead.
Avoid doing I/O one character at a time. Especially avoid the fgetc and fputc functions. The C standard requires that these functions generate an actual function call, which introduces a substantial overhead for each character read or written. If you must do I/O one character at a time, use getc and putc, which generate inline code and cause a subroutine call only when necessary to read or write a new buffer. For debugged programs that use getc and putc, you can #define the symbol _FASTIO where appropriate. This increases the speed of getc and putc by removing checks for invalid FILE pointers.
Read or write an entire record at a time using the multiple item feature of fread and fwrite or afread and afwrite. Reading less than one record at a time increases the number of subroutine calls required and, therefore, decreases performance. Reading more than one record at a time is not harmful, but it is not particularly beneficial, because data are buffered within the library one record at a time.
Use a large block size. Data are transferred to and from the disk in blocks, so increasing the block size decreases the number of I/O operations and subroutine calls required to read a given amount of data.
Use VIO for temporary files under MVS. VIO uses system paging to manage the data and is substantially more efficient than a real temporary data set.
Consider alternatives to printf. printf is one of the most expensive routines at run time, because of the need to interpret the format string. In many cases, you can use a faster routine to do the same thing. For instance, use puts( str ) instead of printf("%s\n", str).
Put the transient library into MVS LPALIB or a VM shared segment. This has two advantages: it cuts down the overhead associated with dynamically loading I/O routines, and it decreases system paging because all C programs can share the same copy of the library routines.
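
Two of the recommendations above, binary access and reading or writing an entire record per call, can be sketched in portable standard C. The record length of 80 and the function name copy_records are assumptions made for the illustration; the nonportable afread and afwrite functions are not shown.

```c
#include <stdio.h>

#define RECLEN 80   /* assumed fixed record length for this sketch */

/* Copy fixed-length records from in to out, one whole record per
   fread/fwrite call.  Both streams are expected to be open for binary
   access.  A trailing partial record, if any, is not copied.
   Returns the number of records copied, or -1 on a write error. */
long copy_records(FILE *in, FILE *out)
{
   char rec[RECLEN];
   long count = 0;

   while (fread(rec, 1, RECLEN, in) == RECLEN) {
      if (fwrite(rec, 1, RECLEN, out) != RECLEN)
         return -1;
      count++;
   }
   return count;
}
```

Each loop iteration makes two library calls per record, compared with 160 calls per record for a loop built on fgetc and fputc.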

Processing carriage control characters as data

Q. How can I process the carriage control characters as data in a file defined as RECFM=xxA (where xx is F, FB, V, or VB)?

A. Process the file using binary I/O. When you process a record format A file with text I/O, the library manages the carriage control for you, so it can correctly handle C control characters like '\f' and '\r'. But when you use binary I/O, the library leaves the data alone, so you can process the file as you see fit.
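
Once a record has been read with binary access, the control character is simply the first byte of the record. The following sketch shows one way a program might separate it from the print data. The structure and function names are hypothetical, and the interpretation in asa_line_advance assumes the file uses the usual ANSI control character conventions.

```c
#include <stddef.h>

/* View of one record read with binary access: the first byte is the
   carriage control character, the rest is the print data. */
struct asa_record {
   char cc;             /* control character, as stored in the file */
   const char *text;    /* points into the caller's buffer          */
   size_t textlen;      /* number of data bytes after the control   */
};

/* Split a raw record; returns 0 on success, -1 for an empty record. */
int asa_split(const char *rec, size_t len, struct asa_record *out)
{
   if (len == 0)
      return -1;
   out->cc = rec[0];
   out->text = rec + 1;
   out->textlen = len - 1;
   return 0;
}

/* Lines to advance before printing: ' ' = 1, '0' = 2, '-' = 3,
   '+' = 0 (overprint).  Returns -1 for '1' (skip to a new page)
   and -2 for any other character. */
int asa_line_advance(char cc)
{
   switch (cc) {
   case ' ': return 1;
   case '0': return 2;
   case '-': return 3;
   case '+': return 0;
   case '1': return -1;
   default:  return -2;
   }
}
```

The record buffer itself would be filled by a binary read of one record; the sketch starts from that point so that it does not depend on any particular read function.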

Processing SMF records

Q. How can I best process SMF records using C I/O?

A. The library functions afreadh and afread are well-suited to reading SMF records. Simple SMF records consist of a header portion containing data items common to all records, including a record type, followed by data whose format is record-type dependent. Complex SMF records may contain subrecords that occur a varying number of times. For instance, a type 30 record contains an I/O subrecord (or section) for each DD statement opened by a job. To process a simple SMF record in C, use afreadh to read the common record header, and then use afread to read the remainder. The length specified for the call to afread is the largest length possible for the record type. (afread returns the amount of data actually read.)

To process a complex SMF record in C, use afreadh to read each section of the record, using information from previous sections to allow you to map the record. For instance, if the record header indicates you are reading a type 30 record, then you would call afreadh again to read the common header for a type 30 record. This header may indicate you have three sections of type A and two of type B. You can then call afreadh three times to read the A sections, and two more times to read the B sections.

Note: Using afread to read any of the nonheader information is not necessary.

Compatibility between MVS and CMS

Q. What can I do to make my I/O portable between MVS and CMS?

A. You hardly need to do anything special at all. The main source of I/O incompatibility between MVS and CMS is filename syntax. By default, filenames are interpreted as DDnames under MVS, but as CMS disk filenames under CMS. Furthermore, CMS and MVS naming conventions are different. Here are three possible strategies for solving this particular problem (there may be more):

Use DDnames as filenames under both MVS and CMS, so that CMS users have to use the FILEDEF command to define DDnames before running your program. You may use an EXEC to call the program under CMS, in which case the FILEDEFs can be issued by the EXEC.
Use tso or cms style filenames under both systems, and only use names that are acceptable under both systems. For example, use tso:config.data, not tso:config.user.data, which is unacceptable under CMS. Note that this may limit your program to being used from TSO under MVS.
Use the sysname and envname functions to determine at run time whether you are running under MVS or CMS and choose filenames accordingly. This is the most flexible solution because you can choose the filenames most appropriate for each system independently.

After you have solved the filename problem, you will find that your I/O applications move effortlessly between CMS and MVS.

File creation

Q. I call the creat function, followed by close, to create a file without putting any data in it. But when I open it read-only later, I get a message that the file doesn't exist. What is wrong?

A. You attempted to create an empty file; that is, one containing no characters. CMS does not permit such files to exist. Additionally, for reasons explained in detail in Technical Background, under MVS, empty files with sequential organization are treated by the library as not existing. The ISO/ANSI C Standard permits this interpretation because of the existence of systems like CMS.

You can avoid this restriction in two ways:

You can write a single character to the file (for instance, a single '\0') and ignore this character when you read the file later.
Under MVS, investigate using a PDS member instead of a sequential file, as the restriction does not apply to PDS members. Because there are other restrictions for use of PDS members (such as not being able to add to the end of file), this solution is not feasible for some programs.
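
The first approach can be sketched with standard I/O as follows. The function names are hypothetical, and the reader assumes that real data in the file never begins with a '\0' byte, so that a leading '\0' can always be taken to be the placeholder.

```c
#include <stdio.h>

/* Create a file containing only a single '\0' placeholder byte, so
   that the file is not empty.  Returns 0 on success, -1 on failure. */
int create_placeholder(const char *name)
{
   FILE *f = fopen(name, "wb");

   if (f == NULL)
      return -1;
   if (fputc('\0', f) == EOF) {
      fclose(f);
      return -1;
   }
   return fclose(f) == 0 ? 0 : -1;
}

/* Open the file for reading and skip the placeholder if it is the
   first byte.  Returns the stream positioned at the first byte of
   real data (or at end of file), or NULL if the open fails. */
FILE *open_skipping_placeholder(const char *name)
{
   FILE *f = fopen(name, "rb");
   int c;

   if (f == NULL)
      return NULL;
   c = getc(f);
   if (c != '\0' && c != EOF)
      ungetc(c, f);       /* first byte is real data: put it back */
   return f;
}
```

A file created this way and never written to again reads as empty through open_skipping_placeholder, which is the behavior the original creat and close sequence was intended to produce.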

Diagnostic Messages

Q. I do not want the library to issue diagnostic messages when I/O errors occur, because my application has complete error-checking code. How do I suppress the library messages?

A. You can use the quiet function to suppress all library diagnostics or to suppress diagnostics at particular points in execution. If you use quiet, you may occasionally run into errors whose cause cannot be immediately determined. When this happens, you can use the =warning run-time option to override quiet and obtain a library diagnostic without having to recompile the program.

Converting an Assembler VSAM Application

Q. I'm converting an assembler VSAM application to SAS/C, and I need to know the return code set by VSAM when a function like kretrv or kinsert fails. How can I find this information?

A. When the SAS/C library invokes an operating system routine, such as a VSAM macro, and the macro fails, information about the failure is saved in a system macro information structure. You can access the name of the macro that most recently failed via the library macro __sysmi_macname and its return code via __sysmi_rc. For more information on this facility, see System Macro Information.

Sharing an Output PDS

Q. When I open a PDS member for output, the fopen call fails if another user has the PDS allocated, even if it is allocated as SHR. How can I write to the PDS if it is shared with another user?

A. If more than one user writes to the same PDS at the same time, the results are unpredictable. Generally, both members will be damaged. For this reason, when a PDS member (or any other MVS data set) is opened for output, the library allocates the data set to OLD to make sure that no one else writes to it at the same time. In some cases, this may be overprotective, but it prevents file damage from unintended simultaneous access.

In your application, if you are certain that only one user can open the file for output at a time, you should access the file through a DDname rather than through a data set name. You can define the DDname using JCL or a TSO ALLOCATE command as SHR, and the library will not alter this allocation when the DDname is opened. In TSO, you can use the system function to allocate a data set to a specific DDname. Also, in any environment, you can use the osdynalloc function to dynamically allocate the data set.

Note: With a PDSE, it is possible to simultaneously write to distinct members. Even with a PDSE, the effects are unpredictable if the same member is opened by more than one user for output at the same time.