What is Virtual Storage Access Method (VSAM)?

Introduction to VSAM

VSAM is an IBM data access method that enables you to organize and access records in a disk data set. VSAM is available under the z/OS operating environment. There are three types of data set organization:
  • Entry-Sequenced Data Set (ESDS)
  • Key-Sequenced Data Set (KSDS)
  • Relative-Record Data Set (RRDS)
VSAM provides three types of access to the records in a data set:
  • sequential
  • direct
  • skip sequential
In addition, VSAM provides the following access and retrieval options:
  • two direct access modes (addressed or keyed)
  • two access entities (logical records and control intervals)
  • two access directions (forward and backward)
  • retrieval options (such as generic key and key greater-than-or-equal)
SAS supports all of these VSAM features, although not necessarily in all possible combinations. By specifying options in the INFILE statement in your SAS program, you can read, update, create, and erase records from VSAM data sets. See Supported VSAM Operations and Access Types for a summary of the operations that SAS supports.

Access Methods

Access methods are software routines that control the data transfer between primary storage (main memory) and secondary storage devices. Secondary, or auxiliary, storage is independent of the computer's memory (for example, storage on tape or disk). VSAM is designed specifically for use with disks. Because VSAM data set structure permits the use of both direct and sequential access types, you can select either the type or the combination of access types that best suits your specific application requirements.
Direct access means that you can read any data record in a data set directly, without reading the records that precede it. For more information, see Direct Access. (The terms direct and random are sometimes used interchangeably when referring to data organization, access methods, and storage devices. SAS documentation uses the term direct, but you might find that random is used in other literature.)
Sequential access means that you retrieve a series of records in sequence. Sequence has a different meaning for each of the three VSAM data set organizations. For more information, see Sequential Access.
Skip sequential access means that you use a combination of both direct and sequential access. For more information, see Skip Sequential Access.
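The three access types can be illustrated with a small in-memory model. This is only a sketch in Python, not VSAM or SAS code: the `records` list and the three function names are hypothetical, and a sorted list with binary search stands in for whatever VSAM does internally to locate a record.

```python
from bisect import bisect_left

# Toy stand-in for a keyed data set: (key, record) pairs held in key order.
records = [(10, "alpha"), (20, "bravo"), (30, "charlie"), (40, "delta"), (50, "echo")]
keys = [k for k, _ in records]

def read_sequential(start=0):
    """Sequential access: yield every record in order from a starting position."""
    for i in range(start, len(records)):
        yield records[i]

def read_direct(key):
    """Direct access: locate one record without reading the records before it."""
    i = bisect_left(keys, key)
    if i < len(keys) and keys[i] == key:
        return records[i]
    return None  # no record with this key

def read_skip_sequential(wanted_keys):
    """Skip sequential access: position directly at each requested key,
    taking the keys in ascending order so the position only moves forward."""
    pos = 0
    found = []
    for key in sorted(wanted_keys):
        pos = bisect_left(keys, key, pos)  # search only from the current position
        if pos < len(keys) and keys[pos] == key:
            found.append(records[pos])
    return found
```

In this model, skip sequential access differs from repeated direct access only in that each search resumes from the previous position instead of starting over.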

Access Methods and File Organization

Data stored on IBM disks can be organized in a number of ways, which are referred to as data set types. IBM software supports the following data set types:
  • Physical Sequential (PS)
  • Partitioned Organization (PO)
  • Indexed Sequential (IS)
  • Direct Access (DA)
  • Virtual Storage Access Method (VSAM)
VSAM data sets can be one of the following:
  • Entry-Sequenced Data Set (ESDS)
  • Key-Sequenced Data Set (KSDS)
  • Relative-Record Data Set (RRDS)
In each data set type except VSAM, the records are organized in a unique way, depending on their purpose. Each type of data set organization has one or more special access methods. For example, a data set that uses DA organization is characterized by a predictable relationship between the key of a record and the address of that record on a direct access storage device (DASD). The programmer establishes this relationship and must supply most of the logic required to locate the individual records.
VSAM is a multifunction, all-purpose access method. VSAM is different from the other data set types because it provides a functional equivalent for most of the other data set organizations, as follows:
  • ESDS organization is the functional equivalent of Physical Sequential organization (PS).
  • KSDS organization is the functional equivalent of Indexed Sequential organization (IS).
  • RRDS organization is the functional equivalent of Direct Access organization (DA).
The types of data set organizations that you access with VSAM differ from the others in two ways:
  • They are device independent.
  • They can be both sequentially and directly accessed.
You access a record by addressing the record in terms of its displacement (in bytes) from the beginning of the data set, by its key, or by its record number.
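The three addressing schemes can be sketched as simple arithmetic. The Python below is an illustration under simplifying assumptions, not a description of the real VSAM layout: it treats records as stored back to back and ignores the control information and free space that VSAM keeps inside a data set. The function names, the sample keys, and the byte-offset view of a slot are all hypothetical.

```python
def entry_sequence_rbas(record_lengths):
    """Relative-byte address of each record, assuming records are stored
    back to back in entry order (control information is ignored here)."""
    rbas, offset = [], 0
    for length in record_lengths:
        rbas.append(offset)
        offset += length
    return rbas

def slot_offset(rrn, slot_length):
    """Byte displacement of a fixed-length slot in this toy model.
    Relative-record numbers start at 1."""
    if rrn < 1:
        raise ValueError("relative-record numbers start at 1")
    return (rrn - 1) * slot_length

# Addressing by displacement: three records of differing lengths.
rbas = entry_sequence_rbas([80, 120, 50])

# Addressing by key: a mapping from (made-up) key values to record locations.
key_to_rba = dict(zip(["A100", "B200", "C300"], rbas))
```

Here `rbas` is `[0, 80, 200]`, `key_to_rba["B200"]` is `80`, and `slot_offset(4, 100)` is `300`.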
The root of the VSAM access method is the VSAM catalog, which is a disk area for defining data sets and disk space and for maintaining information about each VSAM data set. VSAM catalogs and data sets are created and managed with IBM Access Method Services (AMS), a multifunction service program.

Types of VSAM Data Sets

There are three types of VSAM data sets. The main difference between the three data set types is the logical order in which data records are arranged in the data set. The following is a description of each type of VSAM data set:
ESDS
(Entry-Sequenced Data Set) The record sequence is determined by the order in which the records are entered into the data set, without respect to the record contents. New records are stored at the end of the data set.
An ESDS is appropriate for applications that do not require any particular ordering of the data by the record contents or for those that require time-ordered data. Applications that write a log or journal are well suited to an ESDS.
KSDS
(Key-Sequenced Data Set) The record sequence is determined by a key containing a unique value, such as an employee, invoice, or transaction number. The key is a contiguous portion of the record and is defined when the data set is created. The record order is defined by the EBCDIC collating sequence of the key field contents.
A KSDS is always defined with a prime index that relates the record's key value to its relative location in the data set. VSAM uses the index to locate a record for retrieval and to locate a collating position for record insertion.
A KSDS is the most flexible approach for most applications because the record can be accessed directly via the key field. Access is not dependent on the physical location of the record in the data set.
RRDS
(Relative-Record Data Set) The data set is a string of fixed-length slots, each identified by a relative-record number (RRN). Each slot can either contain a record or be empty. Records are stored and retrieved by the relative-record number of the slot.
An RRDS is appropriate for many applications using fixed-length records or when the record number has a contextual meaning that can be used as a key.
The figure VSAM Data Set Organization: Data Components and Index Components shows how the three types of VSAM data sets are organized.
When a VSAM data set is created, it is defined in a cluster. A cluster encompasses the components of a VSAM data set. ESDS and RRDS clusters have only a data component. A KSDS cluster has a data component and an index component. The index relates each record's key to its location in the data set. VSAM uses the index to sequence and locate the records of a KSDS.
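To make the role of the index concrete, here is a toy Python model of a KSDS cluster with a data component and an index component. It is a sketch only: the class `ToyKsds` and its methods are hypothetical, the flat sorted list is a simplification of the real index structure, and Python's `cp037` codec is assumed as a reasonable stand-in for the EBCDIC code page in use.

```python
from bisect import bisect_left

def ebcdic(key: str) -> bytes:
    """Encode a key so that byte comparison follows an EBCDIC collating sequence."""
    return key.encode("cp037")

class ToyKsds:
    def __init__(self):
        self._index = []  # index component: EBCDIC-encoded keys, kept sorted
        self._data = []   # data component: records in the same order as the keys

    def insert(self, key, record):
        """Use the index to find the collating position, then insert there."""
        k = ebcdic(key)
        i = bisect_left(self._index, k)
        if i < len(self._index) and self._index[i] == k:
            raise KeyError(f"duplicate key: {key}")
        self._index.insert(i, k)
        self._data.insert(i, record)

    def get(self, key):
        """Use the index to locate a record for retrieval."""
        k = ebcdic(key)
        i = bisect_left(self._index, k)
        if i < len(self._index) and self._index[i] == k:
            return self._data[i]
        return None

    def keys_in_sequence(self):
        """Keys in the order that sequential access would return them."""
        return [k.decode("cp037") for k in self._index]
```

Inserting the keys "9Z", "B2", and "a1" in that order yields the sequence "a1", "B2", "9Z", because EBCDIC places lowercase letters before uppercase letters and both before digits. An ASCII sort would give the opposite order. The model also shows why the location of a KSDS record can change: each insertion may shift the records that follow it.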
The table Comparison of VSAM Data Set Types summarizes the differences between the three VSAM data set types.
Figure: VSAM Data Set Organization: Data Components and Index Components (differences between the ESDS, KSDS, and RRDS clusters)
Comparison of VSAM Data Set Types

Question                                                  ESDS                      KSDS                      RRDS
--------------------------------------------------------  ------------------------  ------------------------  -------------------
What is the method for sequential access?                 Entry order               Primary key order         RRN [2]
What is the method for direct access?                     RBA [2]                   Key, RBA                  RRN
What are the types of record format?                      Fixed, Variable, Spanned  Fixed, Variable, Spanned  Fixed
Is record length changeable?                              No                        Yes                       No
Where are new records added?                              End of file               Anywhere                  RRN slot (if empty)
Is embedded free space defined? [1]                       No                        Yes                       No
Can you delete records and reuse space?                   No [3]                    Yes                       Yes
Can you access the data set through an alternate index?   Yes                       Yes                       No
Can you REUSE the file?                                   Yes (if no AIX [2])       Yes (if no AIX)           Yes
Can RBA or RRN change?                                    No                        Yes                       No

[1] You can insert records and change their lengths.
[2] RRN = relative-record number, RBA = relative-byte address, and AIX = alternate index.
[3] You can, however, overlay a record if the length does not change.

VSAM Record Structure and Organization

Records in VSAM data sets are grouped into control intervals, the units of data transfer between main storage and secondary disk storage. Control intervals are contiguous areas of direct access storage that VSAM uses for storing records and the control information that describes them. Although the size of control intervals varies from one data set to another, the size within a data set is fixed, either by VSAM or by you (within VSAM-imposed restrictions). If VSAM chooses the size, it does so based on the DASD type, the record size, and the smallest amount of virtual storage space that the user applications make available for I/O buffers. A spanned record is one that exceeds the established control interval size and therefore crosses one or more control interval boundaries. Spanned records are permitted in an ESDS and a KSDS, but not in an RRDS.
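The relationship between record length, control interval size, and spanning can be sketched as arithmetic. The Python below is a rough model: the parameter `control_bytes`, which stands for the space in each control interval taken by control information, is a hypothetical input. This section does not say how much space that information occupies, so the caller must supply it.

```python
def control_intervals_needed(record_length, ci_size, control_bytes):
    """Number of control intervals one record occupies in this toy model.

    control_bytes is an assumed per-interval overhead for control
    information; it is a parameter here, not a documented constant.
    """
    usable = ci_size - control_bytes
    if usable <= 0:
        raise ValueError("control interval too small for any record data")
    return -(-record_length // usable)  # ceiling division

def is_spanned(record_length, ci_size, control_bytes):
    """A record is spanned when it does not fit in a single control interval."""
    return control_intervals_needed(record_length, ci_size, control_bytes) > 1
```

For example, with a 4096-byte control interval and an assumed 10 bytes of control information, a 100-byte record fits in one interval, while a 10000-byte record spans three.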
Control intervals are grouped into control areas. Control areas are the units of a data set that VSAM preformats as records are added to the data set. VSAM fixes the number of control intervals for each control area. (See ESDS Control Intervals and Control Areas, KSDS Control Intervals and Control Areas, and RRDS Control Intervals and Control Areas for depictions of the control interval formats used by each of the data set types.) KSDS control areas are used for distributing free space throughout the data set, as a percentage of control intervals per control area.
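As a worked example of distributing free space as a percentage of control intervals per control area, the following Python sketch computes how many intervals would be left empty. The function name is hypothetical, and rounding down to a whole number of intervals is an assumption made for illustration; this section does not state the rounding rule that VSAM applies.

```python
def free_cis_per_ca(cis_per_ca, ca_free_percent):
    """Control intervals left empty in each control area.

    Rounds down to a whole interval; that rounding rule is an
    assumption of this sketch, not documented VSAM behavior.
    """
    if not 0 <= ca_free_percent <= 100:
        raise ValueError("percentage must be between 0 and 100")
    return cis_per_ca * ca_free_percent // 100
```

With 180 control intervals per control area and 20 percent free space, this model leaves 36 intervals empty in each control area for later insertions.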
Figure: ESDS Control Intervals and Control Areas (ESDS file control areas)
Figure: KSDS Control Intervals and Control Areas (KSDS file control areas)
Figure: RRDS Control Intervals and Control Areas (RRDS file control areas)