Advanced Topics for Users |
Overview of WHERE Clauses |
In addition to, or instead of including a SYSTEM 2000 where-clause in your view descriptor for selection criteria, you can specify a WHERE clause in a SAS program for selection criteria.
Note: Unlike a SYSTEM 2000 where-clause that is stored in a view descriptor, a WHERE clause in SAS is restricted to variables that correspond to items included in the view descriptor. (A SYSTEM 2000 where-clause can reference items that are contained in a view descriptor and items that are contained in the access descriptor that the view descriptor is based on.)
When you specify a WHERE clause, the SAS/ACCESS interface view engine translates the specified conditions into SYSTEM 2000 conditions. If the view descriptor includes a SYSTEM 2000 where-clause, the interface view engine connects the conditions with the Boolean operator AND. By default, the conditions in the WHERE clause in SAS are connected to the end of the view descriptor conditions. For example, if a view descriptor includes the condition
sex=female
and the WHERE clause condition in SAS translates into
position=marketing
the resulting selection criteria are
sex=female and position=marketing
You can control the connection of the translated WHERE clause in SAS and the SYSTEM 2000 where-clause conditions by including a connecting string in a SYSTEM 2000 where-clause that is included in a view descriptor. A connecting string indicates where you want the connection to occur. For example, if you include the following SYSTEM 2000 where-clause in a view descriptor (*SASAND* is a connecting string),
*sasand* department=marketing
and execute a SAS procedure that includes a WHERE clause that produces the following condition:
salary gt 1000
The resulting selection criteria are
salary gt 1000 and department=marketing
For more information and examples, see Connecting Strings to Order Conditions.
Because there are capabilities in the WHERE clause in SAS that are not available in SYSTEM 2000, when the interface view engine translates the WHERE clause conditions in SAS into SYSTEM 2000 conditions, it is possible that the WHERE clause in SAS cannot be totally executed in SYSTEM 2000.
For this possibility, the interface view engine first evaluates the WHERE clause in SAS and determines which conditions SYSTEM 2000 can support. The interface view engine might be able to partially execute the WHERE clause. For example, in the following program:
proc print data=vlib.emp1; where lastname < 'KAP' and payrate > 30 * overtime; run;
the interface view engine translates as much of the WHERE clause as possible, without producing incorrect results or a syntax error in SYSTEM 2000. In this example, SYSTEM 2000 can execute the first condition, but the arithmetic in the second condition is not supported. Therefore, the engine uses where lastname < 'KAP' to filter out as many data records as possible to improve performance. The conditions that are not supported are bypassed by the engine, and post-processing (performed automatically by SAS) will be required after SYSTEM 2000 completes its subsetting. The engine bypasses:
conditions that exceed the 1000-byte limit of a SYSTEM 2000 where-clause. If the WHERE clause in SAS exceeds 1000 bytes, the rightmost portion of the clause is bypassed by SYSTEM 2000.
When the interface view engine first examines the WHERE clause in SAS and determines which conditions SYSTEM 2000 can support, the engine has not yet processed the view descriptor where-clause. Later, when the engine processes the view descriptor where-clause, the possibility arises that the combined length of the WHERE clause conditions in SAS that can be supported in SYSTEM 2000 and the view descriptor where-clause conditions might exceed 1000 bytes.
If the engine determines that SYSTEM 2000 completely supports the WHERE clause in SAS, but also determines that the conditions cannot be combined due to the 1000-byte limit, an unrecoverable error occurs. To the SAS procedure or DATA step, it appears as if the first "read" observation failed. You might need to carefully examine the error messages in the log to find out what actually happened.
Note: If there is no SYSTEM 2000 where-clause included in the view descriptor and no WHERE clause specified in the SAS program, the interface view engine issues a default where-clause in the form of WHERE Cn EXISTS OR Cn FAILS, where Cn is a component in the lowest-level record in the view descriptor.
The default where-clause "WHERE Cn EXISTS OR Cn FAILS" will guarantee that the view will retrieve 100% of the database defined by that view, but it does cause a complete non-key pass of the database. A more efficient default where-clause can be defined by using the following syntax:
*DEFAULT (WHERE valid-subset where-clause) |
As a knowledgeable user of your database, you might be able to define a where-clause using all key components that will still guarantee that you will retrieve 100% of the database defined by this view.
The *DEFAULT where-clause is validated by SYSTEM 2000 at run time. When you specify a SAS WHERE clause, the *DEFAULT is not used. However, when you open a view that has *DEFAULT specified and do not specify a SAS WHERE clause, *DEFAULT is used to qualify the data. The qualified data is passed to the engine for processing by the application or procedure. If a subsequent SAS WHERE clause is specified, the new WHERE clause is the only qualification that is sent to SYSTEM 2000 for retrieval.
In Table A2.1, assume that C114 is a component in the lowest-level record of a view descriptor.
where-clause in SYSTEM 2000 View Descriptor | WHERE Clause in SAS | SYSTEM 2000 Translation | Post-Processing Required? |
---|---|---|---|
C1=A | C2=B OR C3>C4+10 | (C1=A) | Yes |
C1=A | C2=B & C3>C4+10 | (C1=A) & (C2=B) | Yes |
C1=A | C2=B OR C3>C4 | (C1=A) & (C2=B OR C3*>C4*) | No |
C1=A | C2=B & C3 | (C1=A) & (C2=B) | Yes |
--- | --- | C114 EXISTS OR C114 FAILS | No |
--- | C3*20 < C5 | C114 EXISTS OR C114 FAILS | Yes |
--- | C3 = C5 | C3* = C5* | No |
WHERE Clauses in SAS Translatable to SYSTEM 2000 |
Tables A2.2, A2.3, and A2.4 show the interface view engine translations of acceptable WHERE clause conditions in SAS into where-clause conditions in SYSTEM 2000.
WHERE Clause Operators in SAS | SYSTEM 2000 Operators |
---|---|
= | = |
> | > |
< | < |
<> | != |
>= | >= |
<= | <= |
IS NULL | FAILS |
IS NOT NULL | EXISTS |
( | ( |
) | ) |
AND | AND |
OR | OR |
WHERE Clause Syntax in SAS | SYSTEM 2000 Translation |
---|---|
C1 BETWEEN 1 AND 3 | C1 = 1*3 |
C1 IN (4,9,14) | C1=4 OR C1=9 OR C1=14 |
C4 > C5 | C4* > C5* |
C4 = '02AUG87'D | C4 = 08/02/1987 |
SYSTEM 2000 can handle a limited subset of WHERE clause pattern matching specified in SAS, under the following conditions:
The pattern must have a percent sign (%) as the last character.
Underscores (_) are permitted only in the beginning position(s).
The pattern cannot have a percent sign (%) anywhere except in the beginning or in the last position.
The pattern must have some characters that are not percent signs (%) or underscores (_).
WHERE Clause Syntax in SAS | SYSTEM 2000 Translation |
---|---|
C1 LIKE %ABC% | C1 CONTAINS ABC |
C1 LIKE ABC% | C1 CONTAINS ABC IN 1 |
C1 LIKE _ABC% | C1 CONTAINS ABC IN 2 |
C1 LIKE __ABC% | C1 CONTAINS ABC IN 3 |
WHERE Clauses in SAS Not Translatable to SYSTEM 2000 |
Here are some (but not all) WHERE clause conditions in SAS that are not accepted in SYSTEM 2000. They are executed automatically by SAS post-processing:
arithmetic expressions. For example:
WHERE C1 = C4 * 3 WHERE C4 < -C5
expressions in which a variable or combination of variables assumes a value of 1 or 0 to signify true or false. For example,
WHERE C1 WHERE (C1 = C2) * 20
truncated comparison. For example:
C1 =: ABC
DATETIME and TIME formats. For example:
'12:00'T '01JAN60:12:00'DT
references to nulls indicated by a period (.) for numeric variables or closing quotation marks ('') for character variables. Use WHERE C1 IS NULL , do not use WHERE C1 = . or '' to indicate a null. The interface view engine can translate C1 IS NULL into C1 FAILS.
NOT Operator in SAS and SYSTEM 2000 |
The WHERE clause NOT operator in SAS and the where-clause NOT operator in SYSTEM 2000 do not function the same way. If you want NOT to have its SAS meaning, put it in the WHERE clause in SAS. If you want NOT to have its SYSTEM 2000 meaning, put it in the view descriptor where-clause in SYSTEM 2000.
If you specify NOT in a WHERE clause in SAS, NOT is transformed by the WHERE clause parser in SAS; the interface view engine never sees the NOT operator.
WHERE Clause in SAS | What the Interface View Engine Sees |
---|---|
WH NOT LASTNAME = 'Jones'; | WH LASTNAME NE 'Jones'; |
WH NOT LASTNAME > 'Baker'; | WH LASTNAME <= 'Baker'; |
WH NOT (LASTNAME = JONES
AND HIREDATE > '02aug82'd); |
WH LASTNAME NE 'Jones' OR HIREDATE <= '02aug82'd; |
In SYSTEM 2000, the logical converse of wh not lastname = 'Jones'; is wh lastname ne Jones or lastname fails . Before any relational operator can find a match for a value, the value must exist. One reason for this is that nulls are not contained in SYSTEM 2000 indexes, and processing an operator such as NE could be expensive if it were not confined to indexed values.
Copyright © 2007 by SAS Institute Inc., Cary, NC, USA. All rights reserved.