I/O Functions

Technical Background

This section provides an in-depth summary of the fundamentals of C I/O. It begins with traditional C I/O concepts, then covers UNIX low-level, ISO/ANSI, and IBM 370 I/O concepts. These concepts are combined in SAS/C I/O Concepts and 370 Perspectives on SAS/C Library I/O. The final section provides guidelines for choosing an I/O method, based on the needs of your application.


Traditional C (UNIX) I/O Concepts

When C was initially designed, no library, and therefore no I/O, was included. It was assumed that libraries suitable for use with particular systems would be developed. Because most early use of the C language was associated with UNIX operating systems, the UNIX I/O functions were considered the standard I/O method for C. As the C language has evolved, the I/O definition has changed to some extent, but understanding the underlying UNIX concepts is still important.

In addition, many useful C programs were first developed under UNIX operating systems, and such programs frequently take no account of other systems or I/O techniques. These programs cannot be made to run on systems as different from UNIX as CMS or OS/390 without careful attention to the assumptions of their original environment.

The UNIX I/O model

The main features of the UNIX I/O model are as follows:


UNIX Low-Level I/O

One complication in programs developed under UNIX operating systems is that UNIX defines two different I/O interfaces: standard I/O and low-level I/O (sometimes called unbuffered I/O). Standard I/O is a more portable form of I/O than low-level I/O, and UNIX documentation recommends that portable programs be written using this form. However, UNIX low-level I/O is widely recognized as more efficient than standard I/O, and it provides some additional capabilities, such as the ability to test whether a file exists before it is opened. For these and other reasons, many programs use low-level I/O, despite its documented lack of portability.

UNIX operating systems also support a mixed-level form of I/O, in which a file is accessed simultaneously with standard I/O and low-level I/O. C implementations that support the UNIX low-level functions may be unable to support mixed-level I/O if the two forms of I/O are not closely related in the UNIX manner.

UNIX low-level I/O is not included in the ISO/ANSI C standard, so it may be unavailable with recently developed C compilers. Also, do not assume that this form of I/O is truly low-level on any system other than UNIX.


ISO/ANSI C I/O Concepts

The definition of the C I/O library contained in the ISO/ANSI C standard is based on the traditional UNIX standard I/O definition, but differs from it in many ways. These differences exist to support efficient I/O implementations on systems other than UNIX, and to provide some functionality not offered by UNIX. In general, where definitions of I/O routines differ between ISO/ANSI C and UNIX C, programs should assume the ISO/ANSI definitions for maximum portability. The ISO/ANSI definitions are designed for use on many systems including UNIX, while the applicability of the UNIX definitions is more limited.

Text access and binary access

In the UNIX I/O model, files are divided into lines by the new-line character ('\n'). For this reason, C programs that process input files one line at a time traditionally read characters until a new-line character is encountered. Similarly, programs that write output one line at a time write a new-line character after each line of data.

Many systems other than UNIX use other conventions for separating lines of text. For instance, the IBM PC operating system, PC DOS, separates lines of text with two characters, a carriage return followed by a line feed. The IBM 370 uses yet another method. To enable a line-oriented C program written for UNIX to execute under PC DOS, a C implementation must translate a carriage return and line feed to a new-line character on input, and must translate a new-line character to a carriage return and line feed on output. Although this translation is appropriate for a line-oriented program, it is not appropriate for other programs. For instance, a program that writes object code to a file cannot tolerate replacement of a new-line character in its output by a carriage return and a line feed. For this reason, most systems other than UNIX require two distinct forms of file access: text access and binary access.

The ISO/ANSI I/O definition requires that when a program opens a file, it must specify whether the file is to be accessed as a text stream or a binary stream. When a file is accessed as a binary stream, the implementation must read or write the characters without modification. When a file is accessed as a text stream, the implementation must present the file to the program as a series of lines separated by new-line characters, even if a new-line character is not used by the system as a physical line separator. Thus, under PC DOS, when a program writes a file using a binary stream, any new-line characters in the output data are written to the output file without modification. But when a program writes a file using a text stream, a new-line character in the output data is replaced by a carriage return and a line feed to serve as a standard PC DOS line separator.

If a file contains a real new-line character (one that is not a line separator) and the file is read as a text stream, the program will probably misinterpret the new-line character as a line separator. Similarly, a program that writes a carriage return to a text stream may generate a line separator unintentionally. For this reason, the ISO/ANSI library definition leaves the results undefined when any nonprintable characters (other than horizontal tab, vertical tab, form feed, and the new-line character) are read from or written to a text stream. Therefore, text access should be used only for files that truly contain text, that is, lines of printable data.

Programs that open a file without explicitly specifying binary access are assumed to require text access, because the formats of binary data, such as object code, vary widely from system to system. Thus, portable programs are more likely to require text access than binary access.

Padding

Many non-UNIX file systems require files to consist of one or more data blocks of a fixed size. In these systems, the number of characters stored in a file must be a multiple of this block size. This requirement can present problems for programs that need to read or write arbitrary amounts of data unrelated to the block size; however, it is not a problem for text streams. When a text stream is used, the implementation can use a control character to indicate the logical end of file. This approach cannot be used with a binary stream, because the implementation must pass all data in the file to the program, whether or not that data includes control characters.

The ISO/ANSI C library definition deals with fixed data blocks by permitting output files accessed as binary streams to be padded with null ('\0') characters. This padding permits systems that use fixed-size data blocks to always write blocks of the correct size. Because of the possibility of padding, files created with binary streams on such systems may contain one or more null characters after the last character written by the program. Programs that use binary streams and require an exact end-of-file indication must write their own end-of-file marker (which may be a control character or sequence of control characters) to be portable.

A similar padding concern can occur with text access. Some systems support files where all lines must be the same length. (Files defined under OS/390 or CMS with record format F are of this sort.) ISO/ANSI permits the implementation to pad output lines with blanks when these files are written and to remove the blanks at the end of lines when the files are read. (A blank is used in place of a null character, because text access requires a printable padding character.) Therefore, portable programs cannot write lines containing trailing blanks and expect to read the blanks back if the file will be processed later as input.

Similarly, some systems (such as CMS) support only nonempty lines. Again, ISO/ANSI permits padding to circumvent such system limitations. When a text stream is written, the Standard permits the implementation to write a line containing a single blank, rather than one containing no characters, provided that this line is always read back as one containing no characters. Therefore, portable programs cannot expect to distinguish an empty line from a line that contains only a single blank.

Finally, some systems (such as CMS) do not permit files containing no characters. A program is nonportable if it assumes a file can be created merely by opening it and closing it, without writing any characters.

File positioning with fseek and ftell

As stated earlier, the UNIX I/O definition features seeking by character number. For instance, it is possible to position directly to the 10,000th character of a file. On a system where text access and binary access are different, the meaning of a request to seek to the 10,000th character of a text stream is not well defined. ftell and fseek enable you to obtain the current file position and return to that position, no matter how the system implements text and binary access.

Consider a system such as PC DOS, where the combination of carriage return and line feed is used as a line separator. Because of the translation, a program that counts the characters it reads is likely to determine a different character position from the position maintained by the operating system. (A line that the program interprets as n characters, including a final new-line character, is known by the operating system to contain n+1 characters.)

Some systems, such as the 370 operating systems, do not record physical characters to indicate line breaks. Consider a file on such a system composed of two lines of data, the first containing the single character 1 and the second containing the single character 2. A program accessing this file as a text stream receives the characters 1\n2\n. The program must process four characters, although only two are physically present in the file. A request to position to the second character is ambiguous: the library cannot determine whether the next character read should be \n or 2.

Even if you resolve the ambiguity of file positioning in favor of portability (by counting the characters seen by the program rather than physical characters), implementation difficulties may preclude seeking to characters by number using a text stream. Under PC DOS, the only way to seek accurately to the 10,000th character of a file is to read 10,000 characters because the number of carriage return and line feed pairs in the file is not known in advance. If the file is opened for both reading and writing, replacing a printable character with a new-line character requires replacing one physical character with two. This replacement requires rewriting the entire file after the point of change. Such difficulties make it impractical on many systems to seek for text streams based on a character number.

Situations such as those discussed in this section show that on most systems where text and binary access are not identical, positioning in a text stream by character number cannot be implemented easily. Therefore, the ISO/ANSI standard permits a library to implement random access to a text stream using some indicator of file position other than character number. For instance, a file position may be defined as a value derived from the line number and the offset of the character in the line.

File positions in text streams cannot be used arithmetically. For instance, you cannot assume that adding 1 to the position of a particular character results in the position of the next character. Such file positions can be used only as tokens. This means that you can obtain the current file position (using the ftell function) and later return to that position (using the fseek function), but no other portable use of the file position is possible.

This change from UNIX behavior applies only to text streams. When you use fseek and ftell with a binary stream, the ISO/ANSI standard still requires that the file position be the physical character number.

File positioning with fgetpos and fsetpos

Even with the liberal definition of random access to a text stream given in the previous section, implementation of random access can present major problems for a file system that is very different from a traditional UNIX file system. The traditional OS/390 file system is an example of such a system. To assist users of these file systems, the Standard includes two non-UNIX functions, fsetpos and fgetpos.

File systems like the OS/390 file system have two difficulties implementing random access in the UNIX (ISO/ANSI binary) fashion:

The fsetpos and fgetpos functions did not exist prior to the definition of the ISO/ANSI C standard. Because many C libraries have not yet implemented them, they are currently less portable than fseek and ftell, which are compatible with UNIX operating systems.

However, it is a relatively straightforward task to implement them as macros that call fseek and ftell in such systems. After these macros have been written, fsetpos and fgetpos are essentially as portable as their UNIX counterparts and will offer substantial additional functionality where provided by the library on systems such as OS/390.

The ISO/ANSI I/O model

The following list describes the I/O model for ISO/ANSI C. The points are listed in the same order as the corresponding points for the UNIX I/O model, as presented in the previous section.


IBM 370 I/O Concepts

Programmers accustomed to other systems frequently find the unique nature of 370 I/O confusing. This section organizes the most significant information about 370 I/O for SAS/C users. Note that this description is general rather than specific. Details and complex special cases are generally omitted to avoid obscuring the basic principles. See the introduction to the SAS/C Compiler and Library User's Guide for a small bibliography of relevant IBM publications that should be consulted for additional information.

Fundamental principles


File organizations under OS/390

Under OS/390, files are classified first by file organization. A number of different organizations are defined, each tailored for an expected type of usage. For instance, files with sequential organization are oriented towards processing records in sequential order, while most files with VSAM (Virtual Storage Access Method) organization are oriented toward processing based on key fields in the data.

For each file organization, there is a corresponding OS/390 access method for processing such files. (An OS/390 access method is a collection of routines that can be called by a program to perform I/O.) For instance, files with sequential organization are normally processed with the Basic Sequential Access Method (BSAM). Sometimes, a file can be processed in more than one way. For example, files with direct organization can be processed either with BSAM or with the Basic Direct Access Method (BDAM).

The file organizations of most interest to C programmers are sequential and partitioned. The remainder of this section relates primarily to these file organizations, but many of the considerations apply equally to the others. A number of additional considerations apply specifically to files with partitioned organization. These considerations are summarized in OS/390 partitioned data sets.

Note:    An important type of OS/390 file, the Virtual Storage Access Method (VSAM) file, was omitted from the previous list. VSAM files are organized as records identified by a character string or a binary key. Because these files differ so greatly from the expected C file organization, they are difficult to access using standard C functions. Because of the importance of VSAM files in the OS/390 environment, full access to them is provided by nonportable extensions to the standard C library.

Note:    Also, if your system supports UNIX System Services (USS) under OS/390, it provides a hierarchical file system similar to the one offered on UNIX. The behavior of files in the hierarchical file system is described in UNIX Low-Level I/O. Only traditional OS/390 file behavior is described here.

The characteristics of a sequential or partitioned file are defined by a set of attributes called data control block (DCB) parameters. The three DCB parameters of most interest are record format (RECFM), logical record length (LRECL), and block size (BLKSIZE).

As stated earlier, OS/390 files are stored as a sequence of records. To improve I/O performance, records are usually combined into blocks before they are written to a device. The record format of a file describes how record lengths are allowed to vary and how records are combined into blocks. The logical record length of a file is the maximum length of any record in a file, possibly including control information. The block size of a file is the maximum size of a block of data.

The three primary record formats for files are F (fixed), V (variable), and U (undefined). Files with record format F contain records that are all of equal length. Files with format V or U may contain records of different lengths. (The differences between V and U are mostly technical.) Files of both F and V format are frequently used; the preferred format for specific kinds of data (for instance, program source) varies from site to site.

Ideally, the DCB parameters for a file are not relevant to the C program that processes it, but sometimes a C program has to vary its processing based on the format of a file, or to require a file to have a particular format. Some of the reasons for this are as follows:


File organizations under CMS

Like most operating systems, CMS has its own native file system. (In fact, it has two: the traditional minidisk file system and the hierarchical shared file system.) Unlike most operating systems, CMS can also simulate the file systems of other IBM operating systems, notably OS and VSE. Also, CMS can transfer data between users in spool files with the VM control program (CP).

Therefore, CMS files are classified first by the type of I/O simulation (or lack thereof) used to read or write to them. The three types are

CMS I/O simulation can be used to read files created by OS or VSE, but these operating systems cannot read files created by CMS, even when the files are created using CMS's simulation of their I/O system. In general, CMS adequately simulates OS and VSE file organizations, and the rules that apply in the real operating system also apply under CMS. However, the simulation is not exact: it differs in some details, and some facilities are not supported at all.

CMS-format files, particularly disk files, are of most interest to C programmers. CMS disk files have a logical record length (LRECL) and a record format (RECFM). The LRECL is the length of the largest record; it may vary between 1 and 65,535. The RECFM may be F (for files with fixed-length records) or V (for files with variable-length records). Other file attributes are handled transparently under CMS. Files are grouped by minidisk, a logical representation of a physical direct-access device. The attributes of the minidisk, such as writability and block size, apply to the files it contains. Files in the shared file system are organized into directories, conceptually similar to UNIX directories.

Records in a RECFM F file must all be the same length, equal to the file's LRECL. The LRECL is assigned when the file is created and may not be changed. Some CMS commands require that input data be in a RECFM F file. To support RECFM F files, a C implementation must either pad or split lines of output data to conform to the LRECL, and remove the padding from input records.

RECFM V files have records of varying length. The LRECL is the length of the longest record in the file, so it may be changed at any time by appending a new record that is longer than any other record. However, rewriting an existing record with a different length causes all following records to be erased. The length of any given record can be determined only by reading the record. (Note that the CMS LRECL concept is different from the OS/390 concept for V format files, because the LRECL under OS/390 includes extra bytes used for control information.)

Some rules apply for both RECFM F and RECFM V files. Records in CMS files contain only data. No control information is embedded in the records. Records may be updated without causing loss of data. Files may be read sequentially or accessed randomly by record number.

As under OS/390, files that are intended to be printed reserve the first character of each record for an ANSI carriage control character. Under CMS, these files can be given a filetype of LISTING, which is recognized and treated specially by commands such as PRINT. If a C program writes layout characters, such as form feeds or carriage returns, to a file to effect page formatting, the file should have the filetype LISTING to ensure proper interpretation by CMS.

Be aware that the standard C language does not provide any way for you to interrogate or define file attributes. In cases in which a program depends on file attribute information, you have two choices. You can use the FILEDEF command to define file attributes (if your program uses DDnames), or you can use nonportable mechanisms to access or specify this information during execution.

OS/390 partitioned data sets

As stated earlier, one of the important OS/390 file organizations is the partitioned organization. A file with partitioned organization is more commonly called a partitioned data set (PDS) or a library. A PDS is a collection of sequential files, called members, all of which share the same area of disk space. Each member has an eight-character member name. Under OS/390, source and object modules are usually stored as PDS members. Also, almost any other sort of data may be stored as a PDS member rather than as an ordinary sequential file.

Partitioned data sets have several properties that make them particularly difficult for programs that were written for other file systems to handle:

These limitations may cause ISO/ANSI-conforming programs to fail when they use PDS members as input or output files. For instance, it is reasonable for a program to assume that it can append data to the end of a file. But due to the nature of PDS members, it is not feasible for a C implementation to support this, except by saving a copy of the member and then replacing the member with the copy. Although this technique is viable, it is very inefficient in both time and disk space. (This tradeoff between poor performance and reduced functionality is one that must be faced frequently when using C I/O on the 370. PDS members, which are perhaps the most commonly used kind of OS/390 file, are the most prominent examples of such a tradeoff.)

Note:    Recent versions of OS/390 support an extended form of PDS, called a PDSE. Some of the previously described restrictions on a PDS do not apply to a PDSE. For example, unused space is reclaimed automatically in a PDSE.

CMS MACLIBs and TXTLIBs

Two important OS-simulated file types on CMS are the files known as MACLIBs and TXTLIBs. Both are simulations of OS partitioned data sets. MACLIBs are typically used to collect textual data or source code; TXTLIBs may contain only object code. Unlike OS PDSs, these files always have fixed-length, 80-character records.

In general, MACLIBs and TXTLIBs cannot be written with OS-simulated I/O. Instead, data are added or removed a member at a time by CMS commands. Input from MACLIBs and TXTLIBs can be performed using either OS simulation or native CMS I/O.

Identifying files

In UNIX operating systems and similar systems, files are identified in programs by name, and program flexibility with files is achieved by organizing files into directories. Files with the same name may appear in several directories, and the use of a command language to establish working directories enables the user of a program to define program input and output flexibly at run time.

In the traditional OS/390 file system, all files occupy a single name space. (This is an oversimplification, but a necessary one.) Programs that open files by a physical filename are limited to the use of exactly one file at a site. You can use several techniques to increase program flexibility in this area, none of which is completely satisfactory. These techniques include the following:

Under CMS, you can use other techniques to increase program flexibility:


File existence

Under OS/390, the concept of file existence is not nearly so clear-cut as on other systems, due primarily to the use of DDnames and control language. Because DDnames are indirect filenames, the actual filename must be provided through control language before the start of program execution. If the file does not already exist at the time the DD statement or ALLOCATE command is processed, it is created at that time. Therefore, a file accessed with a DDname always exists before program execution begins.

An alternate interpretation of file existence under OS/390 that avoids this problem is to declare that a file exists after a program has opened it for output. By this interpretation, a file created by control language immediately before execution does not yet exist. Unfortunately, this definition of existence cannot be implemented because of the following technicalities:

A third interpretation of existence is to say that an OS/390 file exists if it contains any data (as recorded in the VTOC). This has the disadvantage of making it impossible to read an empty file but the much stronger advantage that a file created by control language immediately before program execution is perceived as not existing. (footnote 1)

This ambiguity about the meaning of existence applies only to files with sequential organization. For files with partitioned organization, only the file as a whole is created by control language; the individual members are created by program action. This means that existence has its natural meaning for PDS members, and that it is possible to create a PDS member containing no characters.

CMS does not allow the existence of files containing no characters, and it is not possible to create such a file.

Miscellaneous differences from UNIX operating systems

The following section lists some additional features of UNIX operating systems and UNIX I/O that some programmers expect to be available on the 370 systems. These features are generally foreign to the 370 environment and impractical to implement. Code that expects the availability of these features is not portable to the 370 no matter how successfully it runs on other architectures.


Summary of 370 I/O characteristics

The following list describes the characteristics of 370 files (without any special reference to the C language). The points are listed in the same order as the corresponding points for the UNIX and ISO/ANSI I/O models as presented earlier:


SAS/C I/O Concepts

In an ideal C implementation, C I/O would possess all three of the following properties:

For the reasons detailed in IBM 370 I/O Concepts, C I/O on the 370 cannot support all three of these properties simultaneously. The library provides several different kinds of I/O to allow the programmer to select the properties that are most important.

The library offers two separate I/O packages:

Details on both of these I/O packages are presented in the following sections. (footnote 2)

Standard I/O

Standard I/O is implemented by the library in accordance with its definition in the C Standard. A file may be accessed as a binary stream, in which case all characters of the file are presented to the program unchanged. When file access is via a binary stream, all information about the record structure of the file is lost. On the other hand, a file may be accessed as a text stream, in which case record breaks are presented to the program as new-line characters ('\n'). When data are written to a text file and then read, the data may not be identical to what was written because of the need to translate control characters and possibly to pad or split text lines to conform to the attributes of the file.

Besides the I/O functions defined by the Standard, several augmented functions are provided to exploit 370-specific features. For instance, the afopen function is provided to allow the program to specify 370-dependent file attributes, and the afread routine is provided to allow the program to process records that may include control characters. Both standard I/O functions and augmented functions may be used with the same file.

Library access methods

The low-level C library routines that interface with the OS/390 or CMS physical I/O routines are called C library access methods, or access methods for short. (The term OS/390 access method always refers to access methods such as BSAM, BPAM, and VSAM to avoid confusion.) Standard I/O supports five library access methods: "term", "seq", "rel", "kvs", and "fd". The file can span multiple volumes.

When a file is opened, the library ordinarily selects the access method to be used. However, when you use the afopen function to open a file, you can specify one of these particular access methods.

The "rel" access method

Under OS/390, the "rel" access method can be used for files with sequential organization and RECFM F, FS, or FBS. (The limitation to sequential organization means that the "rel" access method cannot be used to process a PDS member.) Under CMS, the "rel" access method can be used for disk files with RECFM F. The "rel" access method is designed to behave like UNIX disk I/O:

Because of the nature of the 370 file system, complete UNIX compatibility is not possible. In particular, the following differences still apply:

The "kvs" access method

The "kvs" access method processes any file opened with the extension open mode "k" (indicating keyed I/O). This access method is discussed in more detail in Using VSAM Files.

The "fd" access method

The "fd" access method processes any file residing in the USS OS/390 hierarchical file system. These files are fully compatible with UNIX. In files processed with the "fd" access method, there is no difference between text and binary access.

The "seq" access method

The "seq" access method processes a nonterminal, non-USS file if any one of the following apply:

In general, the "seq" access method is implemented to use efficient native interfaces, forsaking compatibility with UNIX operating systems where necessary. Some specific incompatibilities are listed here:


UNIX style I/O

The library provides UNIX style I/O to meet two separate needs:

As a result of the second property, UNIX style I/O is less efficient than standard I/O for the same file, unless the file is suitable for "rel" access, or it is in the USS hierarchical file system. In these cases, there is little additional overhead.

For files suitable for "rel" access, UNIX style I/O simply translates I/O requests into corresponding calls to standard I/O routines. Thus, for these files there is no decrease in performance.

For files in the USS hierarchical file system, UNIX style I/O calls the operating system low-level I/O routines directly. For these files, use of standard I/O by UNIX style I/O is completely avoided.

For other files, UNIX style I/O copies the file to a temporary file using the "rel" access method and then performs all requested I/O to this file. When the file is closed, the temporary file is copied back to the user's file, and the temporary file is then removed. This means that UNIX style I/O for files not suitable for "rel" access has the following characteristics:

All of the discussion within this section assumes that the user's file is accessed as a binary file: that is, without reference to any line structure. Occasionally, programs want to use this interface to access a file as a text file. (Most frequently, such programs come from non-UNIX environments such as the IBM PC.)

As an extension, the library supports using UNIX style I/O to process a file as text. However, file positioning by character number is not supported in this case, and no copying of data takes place. Instead, UNIX style I/O translates I/O requests to calls equivalent to standard I/O routines.

Note that UNIX style I/O represents open files as small integers called file descriptors. Unlike under UNIX, file descriptors under OS/390 and CMS have no inherent significance. Some UNIX programs assume that file descriptors 0, 1, and 2 are always associated with the standard input, output, and error files. This assumption is nonportable, but the library attempts to support it where possible. Programs that use descriptor 0 only for input and descriptors 1 and 2 only for output, and that do not issue seeks against these files, are likely to execute successfully. Programs that use these descriptors in other ways, or that mix UNIX style and standard I/O access to these files, are likely to fail.
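The safe pattern described above (descriptor 0 only for input, descriptors 1 and 2 only for output, no seeking) is the classic UNIX filter shape. A minimal sketch using only standard low-level calls:

```c
#include <fcntl.h>
#include <unistd.h>

/* Portable filter skeleton: copy everything from infd to outfd,
   reporting problems on errfd.  No lseek() is ever issued, so the
   three standard descriptors may be passed safely. */
int copy_fd(int infd, int outfd, int errfd)
{
    char buf[512];
    ssize_t n;
    while ((n = read(infd, buf, sizeof(buf))) > 0)
        if (write(outfd, buf, (size_t)n) != n) {
            write(errfd, "write error\n", 12);
            return -1;
        }
    return n < 0 ? -1 : 0;
}

/* The conventional usage: input on 0, output on 1, errors on 2. */
int run_filter(void)
{
    return copy_fd(0, 1, 2);
}
```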

UNIX operating systems follow specific rules when assigning file descriptors to open files. The library follows these rules for USS files and for sockets. However, OS/390 or CMS files accessed using UNIX I/O are assigned file descriptors outside of the normal UNIX range to avoid affecting the number of USS files or sockets the program can open. UNIX programs that use UNIX style I/O to access OS/390 or CMS files may therefore need to be changed if they require the UNIX algorithm for allocation of file descriptors.


370 Perspectives on SAS/C Library I/O

This section describes SAS/C I/O from a 370 systems programmer's perspective. In contrast to the other parts of this chapter, this section assumes some knowledge of 370 I/O techniques and terminology.

OS/390 I/O implementation

Under OS/390, the five C library access methods are implemented as follows:

Although BDAM is not used by the "rel" access method, direct organization files that are normally processed by BDAM are supported, provided they have fixed-length records and no physical keys.

CMS I/O implementation

The C library access methods are implemented under CMS as follows:


File attributes for "rel" under OS/390

Under OS/390, a file can be processed by the "rel" access method if it is not a PDS or PDS member, and if it has RECFM F, FS, or FBS. These record formats ensure that there are no short blocks or unfilled tracks in the file, except the last, and make it possible to reliably convert a character number into a block address (in CCHHR form) for the use of XDAP. Use of "rel" may also be specified for regular files in the USS file system (in which case the "fd" access method is used).

If the LRECL of an FBS file is 1, then an accurate end-of-file pointer can be maintained without adding any padding characters. Because of the use of BSAM and XDAP to process the file, use of this tiny record size does not affect program efficiency (data are still transferred a block at a time). However, it may lead to inefficient processing of the file by other programs or languages, notably ones that use QSAM.

File attributes for "rel" access under CMS

Under CMS, a file can be processed by the "rel" access method if it is a CMS disk file (not filemode 4) with RECFM F. Use of RECFM F ensures that a character number can be converted reliably to a record number and an offset within the record.
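The conversion that RECFM F makes reliable is simple integer arithmetic. The following sketch shows how a character number maps to a 1-based record number and an offset within the record; the names are illustrative, not library APIs.

```c
/* For a RECFM F file with fixed logical record length lrecl, a byte
   (character) offset maps to exactly one record and one position
   within it.  CMS record numbers are 1-based. */
struct recpos {
    long recnum;   /* 1-based record number */
    long offset;   /* byte offset within that record */
};

struct recpos char_to_recpos(long charno, long lrecl)
{
    struct recpos p;
    p.recnum = charno / lrecl + 1;
    p.offset = charno % lrecl;
    return p;
}
```

With LRECL 1, the record number is simply the character number plus one, which is why that record length permits an exact end-of-file pointer.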

If the LRECL of a RECFM F file is 1, then an accurate end-of-file pointer can be maintained without ever adding any padding characters. Because the file is processed in large blocks (using the multiple record feature of the FSREAD and FSWRITE macros), use of this tiny record size does not affect program efficiency. Nor does it lead to inefficient use of disk space, because the files are physically blocked according to the minidisk block size. However, it may lead to inefficient processing of the file by other programs or languages that process one record at a time.

Temporary files under OS/390

Temporary files are created by the library under two circumstances.

A program can create more than one temporary file during its execution. Each temporary file is assigned a temporary file number, starting sequentially at 1. When a temporary file is closed, its file number becomes available again. When a new temporary file is opened, the lowest available file number is assigned.
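The numbering rule (lowest available number, with numbers reused after close) can be sketched as follows. This is an illustration of the scheme only; MAXTMP and the in-use table are assumptions, not the library's internals.

```c
#define MAXTMP 64

static int in_use[MAXTMP + 1];   /* index 1..MAXTMP; slot 0 unused */

/* Assign the lowest available temporary file number, starting at 1.
   Returns 0 if every number is in use. */
int tmpnum_open(void)
{
    int n;
    for (n = 1; n <= MAXTMP; n++)
        if (!in_use[n]) {
            in_use[n] = 1;
            return n;
        }
    return 0;
}

/* Release a number when its temporary file is closed, making it
   available for the next open. */
void tmpnum_close(int n)
{
    if (n >= 1 && n <= MAXTMP)
        in_use[n] = 0;
}
```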

One of two methods is used to create a temporary file with number nn. First, a check is made for a SYSTMPnn DD statement. If the file number is larger than 99, the last two characters of the DDname are in an encoded form. If this DDname is allocated and references a temporary data set, this data set is associated with the temporary file. If no SYSTMPnn DDname is allocated, the library uses dynamic allocation to create a new temporary file whose data set name depends on the file number. (The system is allowed to select the DDname, so there is no dependency on the SYSTMPnn style of name.) The data set name also depends on information associated with the running C program, so that several C programs can run in the same address space without conflicts occurring between temporary filenames.

Note: If an attempt is made to open temporary file nn and a SYSTMPnn DD statement of the appropriate kind is defined but the file is in use by another SAS/C program running in the same address space, the file number is considered to be unavailable, and the lowest available file number not in use by another such program is used instead.

If a program is compiled with the posix compiler option, then temporary files are created in the USS hierarchical file system, rather than as OS/390 temporary files. The USS temporary files are created in the /tmp HFS directory.

Temporary files are normally allocated using a unit name of VIO and a space allocation of 50 tracks. The unit name and default space allocation can be changed by a site, as described in the SAS/C installation instructions. If a particular application requires a larger space allocation than the default, use of a SYSTMPnn DD statement specifying the required amount of space is recommended.

Temporary files under CMS

Temporary files are created by the library under two circumstances.

A program can create more than one temporary file during its execution. Each temporary file is assigned a temporary file number, starting sequentially at 1.

One of two methods is used to create the temporary file whose number is nn. First, a check is made for a FILEDEF of the DDname SYSTMPnn. If this DDname is defined, then it is associated with the temporary file. If no SYSTMPnn DDname is defined, the library creates a file whose name has the form $$$$$$nn $$$$xxxx, where nn is the temporary file number, and the xxxx part of the filetype is associated with the calling C program. This naming convention allows several C programs to execute simultaneously without conflicts occurring between temporary filenames.

Temporary files are normally created by the library on the write-accessed minidisk with the most available space. Using FILEDEF to define a SYSTMPnn DDname with another filemode allows you to use some other technique if necessary.

Be aware that these temporary files are not known to CMS as temporary files. Therefore, they are not erased if a program terminates abnormally or if the system fails during its execution.

VSAM usage and restrictions

The SAS/C library supports two different kinds of access to VSAM files: standard access and keyed access. Standard access is used when a VSAM file is opened in text or binary mode, and it is limited to standard C functionality. Keyed access is used when a VSAM file is opened in keyed mode. Keyed mode is discussed in detail in Using VSAM Files.

Any kind of VSAM file may be used via standard access, although restrictions apply to particular file types; for example, a KSDS may not be opened for output using standard I/O.

VSAM ESDS, KSDS, and RRDS files are processed using a single RPL. Move mode is used to support spanned records. A VSAM file cannot be opened for write only (open mode "w") unless it was defined by Access Method Services to be a reusable file.


FOOTNOTE 1:   This is the interpretation used in the SAS/C implementation.

FOOTNOTE 2:   Two other I/O packages are provided: CMS low-level I/O, defined for low-level access to CMS disk files, and OS low-level I/O, which performs OS-style sequential I/O. These forms of I/O are nonportable and are discussed in Chapter 2, "CMS Low-Level I/O Functions," and Chapter 3, "OS/390 Low-Level I/O Functions," in SAS/C Library Reference, Volume 2.

FOOTNOTE 3:   The library connects the use of the UNIX low-level I/O interface and the ability to do seeking by character number because UNIX documentation has traditionally stressed that seeking by character number is not guaranteed when standard I/O is used. The UNIX Version 7 Programmer's Manual states that the file position used by standard I/O "is measured in bytes only on UNIX; on some other systems it is a magic cookie."



Copyright © 2001 by SAS Institute Inc., Cary, NC, USA. All rights reserved.