Performance and Efficient View Descriptors

General Information

When you create and use view descriptors, follow these guidelines to minimize the use of IMS and z/OS system resources and to reduce the time IMS takes to access data.
Select only the items your program needs. Selecting unnecessary fields adds extra processing time.
Sorting data can be resource-intensive, even if it is done using the SORT procedure. You should sort data only when sorted data is needed for your program. Note that IMS does not support the ORDER BY clause or a BY statement in an application, such as PROC PRINT ... BY variable...;. If you have an IMS database that does not have an index and you want to use a SAS procedure that requires the data to be sorted, you must first extract the data to sort it. If you have an IMS database that does have an index and you want to use a BY variable other than an index key, you must also extract the data to sort it before executing the SAS procedure.
Where possible, specify selection criteria that can be converted into SSAs to subset the amount of data IMS returns to SAS.

Extracting Data Using a View

If a view descriptor describes a large IMS database and you use the temporary or permanent view descriptor many times, it might be more efficient to extract the data and place it in a SAS data file. Under the following circumstances, you should probably extract data:
  • If you plan to use the same IMS data in several procedures, you might improve performance by extracting it. Placing the data into a SAS data file requires disk space to store the data and I/O to write the data. However, SAS data files are organized to provide optimal I/O performance with PROC and DATA steps. Programs using SAS data files often use less CPU time than programs that directly read IMS data.
  • If you plan to read a large amount of data from a large IMS database and the database is being shared by several users, your direct reading of the data could adversely affect all users' response time. Extracting data can improve response time.
  • If you think directly reading this data would present a security risk, you might want to extract the data and not distribute information about either the access descriptor or view descriptor.

Deciding How to Subset Your Data

There are many reasons why you might want to subset or filter the data that is being returned from a database path that is defined by a view descriptor. The main benefit is performance. Retrieving a portion of the data in the database path is more efficient than retrieving all of the data in the path. Another reason is to enforce security measures, such as restricting users of view descriptors to certain subsets of data.
Once you determine that your application can benefit from using a subset of data, there are several ways that you can subset data in SAS. Use the following guidelines to determine when to use a view descriptor WHERE expression, an application WHERE expression, or a DATA step subsetting IF statement, and when to use a combination of the methods.
Note: Regardless of the method that you choose, for performance reasons that you should always attempt to choose selection criteria that can be converted by the engine into SSAs. If the engine cannot build SSAs for your data request, then a sequential access method is used to retrieve all path data that is defined by the view descriptor.

View Descriptor WHERE Expression

Include a WHERE expression in your view descriptor by using a SUBSET statement when you want to do the following tasks:
  • have selection criteria that you want to always apply, regardless of the application that references the view descriptor.
  • restrict access to data in a way that the selection criteria cannot be viewed, modified, or deleted.
Selection criteria stored in a view descriptor can be protected with a password as well as with operating system security. If an application specifies additional subset criteria, it is combined with the view descriptor selection criteria and treated as an AND search argument.

Application WHERE Expression

Use an application WHERE expression (SAS WHERE statement, clause, or data set option) when the guidelines specified in the previous section do not apply and you meet the following criteria:
  • you want to use the same view descriptor for various tasks (includes DATA steps, procedures, and SCL), where each requires a different subset of data
  • you need to generate dynamic selection criteria for the data that is defined by the view descriptor.

DATA Step IF Statement

Use a subsetting IF statement in a DATA step execution when you meet the following criterion:
  • you need to impose selection criteria that would result in a sequential retrieval of the data that is defined by the view descriptor. This type of criteria does not meet SSA eligibility requirements.
The IMS engine generates SSAs only when all of the conditions in a WHERE expression meet eligibility requirements. The DATA step IF statement enables you to perform filtering that does not meet SSA eligibility requirements, while using a view descriptor WHERE expression or application WHERE expression to obtain the performance benefits from SSAs.

Combination of Methods

There are some comparison operators in SAS that cannot be incorporated into SSAs for DL/I function calls and that cannot be used with the DATA step IF statement. In these cases, you have to evaluate the impact of a sequential retrieval to determine whether that method is acceptable. If it is not, then you can extract a subset of view descriptor data into a SAS data set (or define a DATA step view) using eligible selection criteria, then subset the data set using an application task to achieve the desired performance gains.
If needed, you can mix all of the filtering methods:
data work.subset;
  set vlib.imsview; /*View can contain subset criteria*/

  where (additional eligible conditions for IMS SSAs);
  if (ineligible criteria that would not generate SSAs);
run;
  
For all methods, it is possible that a change in criteria can cause an application that once produced SSAs to no longer produce them and resort to using a sequential access method. You can prevent this from happening with the SAS system option IMSWHST=Y. IMSWHST= is an invocation option that can be placed in the restricted options table so that it cannot be changed or overridden. Should the engine detect that no SSAs can be generated when this option is in effect, it issues a message to the SAS log and terminate the executing task.

Writing Efficient WHERE Statements

Specifying a WHERE statement from which the IMS engine can generate SSAs improves performance. The IMS engine returns to SAS only those database segments that meet your selection criteria. If the IMS engine cannot generate SSAs, all segment occurrences for each IMS record (as defined by the path of segments in the view descriptor) are returned to SAS for further processing.
To determine whether SSAs are being generated by your WHERE statement, set the option IMSDEBUG=Y or set the number of calls for which you want debugging information.
To ensure that your WHERE statements generate SSAs, do the following:
  • When creating descriptors, specify a search field name for all variables that you plan to include in your application's WHERE statements, when possible.
  • Use one of the eight operators supported by IMS in your WHERE statements. The eight operators supported by IMS are listed in the following table, along with their alternate forms.
    IMS Supported Operators
    Operator
    Alternate Form
    =1
    EQ
    >1
    GT
    <1
    LT
    >=
    => or GE
    <=
    =< or LE
    ¬=
    =¬ or NE
    &
    * or AND (dependent AND)
    |
    + or OR (logical OR)
    1Pad the =, >, and < operators with blanks on the right or left.
The ability of the IMS engine to generate SSAs also depends on the database type and on the operators that you use in your WHERE expression.
  • For GSAM databases, no SSAs can be generated.
  • For other database types, the following rules apply:
    • SSAs are generated only for WHERE expressions that involve a variable, an operator, and a literal value. Multiple expressions that use Boolean operators are also used. For example:
      where partnum > 1000
      where partnum > 1000 and 
                      orddate = '31JAN94'd
    • The following operators generate SSAs: = (EQ), > (GT), < (LT), >= (GE), <= (LE), IN, BETWEEN, IS NULL, and IS MISSING. For HDAM databases, only the equals (=), IS MISSING, and IN operators generate SSAs.
    • Compound expressions generate SSAs, except when the expressions are joined by OR and the fields involved are in different segments.

Identifying Inefficient SAS WHERE Conditions

When your view descriptor uses WHERE clauses that have multiple values for a search field, and specifies a path that does not originate from the root segment in the IMS database, it forces the IMS engine to reposition itself to the beginning of the IMS database for each value.
In this example, the WHERE statement tries to find two checking account records in the AcctDBD database.
where chckacct = '345620145345' 
   or chckacct = '345620134663';
Because the CUSTOMER segment is the root segment and the CHCKACCT segment is a child of CUSTOMER, the IMS engine must issue a GU call for each checking account number that it wants to find. It does this in order to reposition itself at the start of the database. If it used GN calls, it might pass by one of the records because they are not sequential.
Specifying multiple values for a search field in a WHERE statement for HDAM IMS databases permits the IMS engine to create a WHERE key list. The IMS engine issues calls that use, at a minimum, the first segment level SSA with a WHERE key list value. When no more data is retrieved from the IMS database for a WHERE key list value, a GU call is used to reposition to the beginning of the database and the next WHERE key list value is used. Processing stops when all WHERE key list values have been used.
The following conditions do not enable the IMS engine to generate SSAs. They cause all data from the IMS database as defined by the view descriptor to be returned to SAS for further processing:
  • HDAM WHERE statements that use a WHERE key list and an OR operator with another search field or key list in the first segment level of the view descriptor, for example:
    where custcode in ('24589689' '29834248') 
       | state in ('CA' 'VA');
  • an OR between two segment levels

Identifying SAS WHERE Conditions That Are Not Acceptable to IMS

The following examples are SAS WHERE conditions that are passed to SAS for further processing.
  • arithmetic expressions, for example:
    where c1=c4*3
    where c4−c5
  • expressions in which a variable or combination of variables assumes a value of 1 or 0 to signify true or false, for example:
    where c1
    where (c1=c2)*20
  • concatenation of character variables: where c2=D2||D3.
  • LIKE, BETWEEN, CONTAINS, SOUNDS LIKE operators, for example:
    where lastname=*'SMITH'
    where lastname like 'D_A%'
  • truncated comparison: where cl=:abc.
  • DATETIME and TIME formats, for example:
    where ctime= '12:00't
    where ctime= '01jan60:12:00'dt
  • comparisons using operators other than equivalence (=) for character variables, for example:
    where name>'A'
    where ssn<='251-09-7384'
  • comparisons using operators other than equivalence (=) for date variables not in the YYMMDD format: where stmtdate>'01JAN01'D. STMTDATE has a DB Content of MMDDYY6.
  • references to missing values. This includes the period (.) for numeric variables, and the IS MISSING and IS NULL operators.
    where stmtdate = .(numeric)
    
    where name = (character)
  • OR requests for conditions in two hierarchical levels of the database: where name='Smith' or stmtamt>0. In this example, the NAME field is in the root segment, and the STMTAMT field is in a child segment.
  • any WHERE statement for a GSAM database: where var1<200.
  • Any reference to a variable that does not have a SEARCH or SEQ field assigned to it in the access descriptor.