WHERE Clauses in SAS and where-clauses in SYSTEM 2000

Overview of WHERE Clauses

In addition to, or instead of including a SYSTEM 2000 where-clause in your view descriptor for selection criteria, you can specify a WHERE clause in a SAS program for selection criteria.
Note: Unlike a SYSTEM 2000 where-clause that is stored in a view descriptor, a WHERE clause in SAS is restricted to variables that correspond to items included in the view descriptor. (A SYSTEM 2000 where-clause can reference items that are contained in a view descriptor and items that are contained in the access descriptor that the view descriptor is based on.)
When you specify a WHERE clause, the SAS/ACCESS interface view engine translates the specified conditions into SYSTEM 2000 conditions. If the view descriptor includes a SYSTEM 2000 where-clause, the interface view engine connects the conditions with the Boolean operator AND. By default, the conditions in the WHERE clause in SAS are connected to the end of the view descriptor conditions. For example, if a view descriptor includes the condition
   sex=female
and the WHERE clause condition in SAS translates into
   position=marketing
the resulting selection criteria are
   sex=female and position=marketing
You can control the connection of the translated WHERE clause in SAS and the SYSTEM 2000 where-clause conditions by including a connecting string in a SYSTEM 2000 where-clause that is included in a view descriptor. A connecting string indicates where you want the connection to occur. For example, if you include the following SYSTEM 2000 where-clause in a view descriptor (*SASAND* is a connecting string),
   *sasand* department=marketing
and execute a SAS procedure that includes a WHERE clause that produces the following condition:
   salary gt 1000
The resulting selection criteria are
   salary gt 1000 and department=marketing
For more information and examples, see Connecting Strings to Order Conditions.
Because there are capabilities in the WHERE clause in SAS that are not available in SYSTEM 2000, when the interface view engine translates the WHERE clause conditions in SAS into SYSTEM 2000 conditions, it is possible that the WHERE clause in SAS cannot be totally executed in SYSTEM 2000.
For this possibility, the interface view engine first evaluates the WHERE clause in SAS and determines which conditions SYSTEM 2000 can support. The interface view engine might be able to partially execute the WHERE clause. For example, in the following program:
    proc print data=vlib.emp1;
    where lastname < 'KAP' 
      and payrate > 30 * overtime;
    run; 
the interface view engine translates as much of the WHERE clause as possible, without producing incorrect results or a syntax error in SYSTEM 2000. In this example, SYSTEM 2000 can execute the first condition, but the arithmetic in the second condition is not supported. Therefore, the engine uses where lastname < 'KAP' to filter out as many data records as possible to improve performance. The conditions that are not supported are bypassed by the engine, and post-processing (performed automatically by SAS) will be required after SYSTEM 2000 completes its subsetting. The engine bypasses:
  • unacceptable conditions.
  • conditions connected by OR to unacceptable conditions.
  • conditions that exceed the 1000-byte limit of a SYSTEM 2000 where-clause. If the WHERE clause in SAS exceeds 1000 bytes, the rightmost portion of the clause is bypassed by SYSTEM 2000.
    When the interface view engine first examines the WHERE clause in SAS and determines which conditions SYSTEM 2000 can support, the engine has not yet processed the view descriptor where-clause. Later, when the engine processes the view descriptor where-clause, the possibility arises that the combined length of the WHERE clause conditions in SAS that can be supported in SYSTEM 2000 and the view descriptor where-clause conditions might exceed 1000 bytes.
    If the engine determines that SYSTEM 2000 completely supports the WHERE clause in SAS, but also determines that the conditions cannot be combined due to the 1000-byte limit, an unrecoverable error occurs. To the SAS procedure or DATA step, it appears as if the first "read" observation failed. You might need to carefully examine the error messages in the log to find out what actually happened.
Note: If there is no SYSTEM 2000 where-clause included in the view descriptor and no WHERE clause specified in the SAS program, the interface view engine issues a default where-clause in the form of WHERE Cn EXISTS OR Cn FAILS, where Cn is a component in the lowest-level record in the view descriptor.
The default where-clause “WHERE Cn EXISTS OR Cn FAILS” guarantees that the view retrieves 100% of the database defined by that view, but it does cause a complete non-key pass of the database. A more efficient default where-clause can be defined by using the following syntax:
*DEFAULT (WHERE valid-subset where-clause)
As a knowledgeable user of your database, you might be able to define a where-clause using all key components that still guarantee that you retrieve 100% of the database defined by this view.
The *DEFAULT where-clause is validated by SYSTEM 2000 at run time. When you specify a SAS WHERE clause, the *DEFAULT is not used. However, when you open a view that has *DEFAULT specified and do not specify a SAS WHERE clause, *DEFAULT is used to qualify the data. The qualified data is passed to the engine for processing by the application or procedure. If a subsequent SAS WHERE clause is specified, the new WHERE clause is the only qualification that is sent to SYSTEM 2000 for retrieval.
In the following table, assume that C114 is a component in the lowest-level record of a view descriptor.
Translating SYSTEM 2000 where-clause and WHERE Clauses in SAS
where-clause in SYSTEM 2000 View Descriptor
WHERE Clause in SAS
SYSTEM 2000 Translation
Post-Processing Required?
C1=A
C2=B OR C3>C4+10
(C1=A)
Yes
C1=A
C2=B & C3>C4+10
(C1=A) & (C2=B)
Yes
C1=A
C2=B OR C3>C4
(C1=A) & (C2=B OR C3*>C4*)
No
C1=A
C2=B & C3
(C1=A) & (C2=B)
Yes
---
---
C114 EXISTS OR C114 FAILS
No
---
C3*20 < C5
C114 EXISTS OR C114 FAILS
Yes
---
C3 = C5
C3* = C5*
No

WHERE Clauses in SAS Translatable to SYSTEM 2000

The next three tables show the interface view engine translations of acceptable WHERE clause conditions in SAS into where-clause conditions in SYSTEM 2000.
SAS Operators Translated into SYSTEM 2000 Operators
WHERE Clause Operators in SAS
SYSTEM 2000 Operators
=
=
>
>
<
<
<>
!=
>=
>=
<=
<=
IS NULL
FAILS
IS NOT NULL
EXISTS
(
(
)
)
AND
AND
OR
OR
Additional SAS Syntax Translations into SYSTEM 2000
WHERE Clause Syntax in SAS
SYSTEM 2000 Translation
C1 BETWEEN 1 AND 3
C1 = 1*3
C1 IN (4,9,14)
C1=4 OR C1=9 OR C1=14
C4 > C5
C4* > C5*
C4 = '02AUG87'D
C4 = 08/02/1987
SYSTEM 2000 can handle a limited subset of WHERE clause pattern matching specified in SAS, under the following conditions:
  • The pattern must be less than 100 characters in length.
  • The pattern must have a percent sign (%) as the last character.
  • Underscores (_) are permitted only in the beginning position(s).
  • The pattern cannot have a percent sign (%) anywhere except in the beginning or in the last position.
  • The pattern must have some characters that are not percent signs (%) or underscores (_).
SAS Pattern Syntax Translated to SYSTEM 2000
WHERE Clause Syntax in SAS
SYSTEM 2000 Translation
C1 LIKE %ABC%
C1 CONTAINS ABC
C1 LIKE ABC%
C1 CONTAINS ABC IN 1
C1 LIKE _ABC%
C1 CONTAINS ABC IN 2
C1 LIKE __ABC%
C1 CONTAINS ABC IN 3

WHERE Clauses in SAS Not Translatable to SYSTEM 2000

Here are some (but not all) WHERE clause conditions in SAS that are not accepted in SYSTEM 2000. They are executed automatically by SAS post-processing:
  • arithmetic expressions such as the following example:
    WHERE C1 = C4 * 3
    WHERE C4 < -C5
  • expressions in which a variable or combination of variables assumes a value of 1 or 0 to signify true or false, such as in the following example:
    WHERE C1 
    WHERE (C1 = C2) * 20
  • concatenation of character variables.
  • truncated comparison, as in the following example:
        C1 =: ABC
  • DATETIME and TIME formats, as in the following example:
    '12:00'T 
    '01JAN60:12:00'DT
  • SOUNDEX.
  • HAVING, GROUP BY, and NOT CONTAINS conditions.
  • references to nulls indicated by a period (.) for numeric variables or closing quotation marks ('') for character variables. Use WHERE C1 IS NULL, do not use WHERE C1 = . or ''to indicate a null. The interface view engine can translate C1 IS NULL into C1 FAILS.

NOT Operator in SAS and SYSTEM 2000

The WHERE clause NOT operator in SAS and the where-clause NOT operator in SYSTEM 2000 do not function the same way. If you want NOT to have its SAS meaning, put it in the WHERE clause in SAS. If you want NOT to have its SYSTEM 2000 meaning, put it in the view descriptor where-clause in SYSTEM 2000.
If you specify NOT in a WHERE clause in SAS, NOT is transformed by the WHERE clause parser in SAS; the interface view engine never sees the NOT operator.
Examples of the NOT Operator in SAS
WHERE Clause in SAS
What the Interface View Engine Sees
WH NOT LASTNAME = 'Jones';
WH LASTNAME NE 'Jones';
WH NOT LASTNAME > 'Baker';
WH LASTNAME <= 'Baker';
WH NOT (LASTNAME = JONES
AND HIREDATE > '02aug82'd);
WH LASTNAME NE 'Jones' OR HIREDATE <= '02aug82'd;
In SYSTEM 2000, the logical converse of wh not lastname = 'Jones'; is wh lastname ne Jones or lastname fails. Before any relational operator can find a match for a value, the value must exist. One reason for this is that nulls are not contained in SYSTEM 2000 indexes, and processing an operator such as NE could be expensive if it were not confined to indexed values.