SPD Server SQL provides
users with the capability to use lightning-fast index scans on large
tables. Rather than scanning entire tables, which can have million
or billions of rows, SPD Server SQL is able to scan cached index metadata
instead of sequentially scanning entire large tables. SPD Server
SQL provides enhanced index scan support for the following functions:
min, max,
count, nmiss, range uss, css, std, stderr,and
var. All of the functions can accept the DISTINCT term
as well.
All index scan capabilities
listed above are available for both standard SPD Server tables as
well as clustered tables, with the exception of the DISTINCT qualifier.
The DISTINCT index scan function is not available in clustered tables.
The count(*) function
is the only function included with the index scan support enhancement
that does not require an index on the table. For example,
select count(*) from tablename;
returns the number of
rows in the large table tablenamewithout performing a row scan of
the table. Table metadata is able to return the correct number of
rows. As a result, the response is as fast as an index scan, even
on an unindexed table in this case.
Count(*) functions with
WHERE clauses require an index for each column referenced in the WHERE
clause, in order for the index scan feature to provide the performance
enhancement. For example, suppose SPD Server table Foo has indexes
on numeric columns a and b. The following count(*) functions benefit
from SPD Server index scan support:
select count(*)
from Foo
where a = 1;
select count(*)
from Foo
where a LT 4
and b EQ 5;
select count(*)
from Foo
where a in (2,4,5)
or b in (10,20,30);
All functions other
than count(*) require an index on function columns in order to exploit
the index scan performance savings. Minimal WHERE clause support is
available for these queries, as long as all functions use the same
column, and the WHERE clause is a simple clause that uses the LT,
LE, EQ, GE, GT, IN, or BETWEEN operator for that column. For example,
suppose that the SPD Server table Bar has indexes on numeric columns
x and y. The following SQL submissions are able to exploit the performance
gains of index scans:
select min(x),
max(x),
count(x),
nmiss(x),
range(x),
count(distinct x)
from Bar;
select min(x),
max(x),
count(x),
nmiss(x),
range(x),
count(distinct x)
from Bar
where x between 5 and 10;
select min(x),
max(x),
count(x),
nmiss(x),
range(x),
count(distinct x)
from Bar
where x gt 100;
select min(x),
min(y),
count(x),
count(y)
from Bar;
If any one function
in a statement does not meet the index scan criteria, all functions
in that statement revert to being resolved by table scan instead of
index scan. Suppose the SPD Server table Oops has indexes on numeric
columns x and y. Column z is not indexed. Then, the SPD Server SQL
statement below
select min(x),
min(y),
count(x),
count(y),
count(z)
from Oops;
is entirely evaluated
by table scan; index scanning is not performed on any of the functions.
To take advantage of index scans, the statement above could be resubmitted
as
select min(x),
min(y),
count(x),
count(y)
from Oops;
select count(y)
from Bar;
The functions min(x),
min(y), count(x), and count(y) are evaluated using index scan metadata
and exploit the performance gains. The function count(y) continues
to be evaluated by table scan. The count(*) function can be combined
with other functions and benefit from index scan performance gains.
Continuing to use the SPD Server table Oops with indexes on numeric
columns x and y, the following SPD Server SQL statement benefit from
index scan performance:
select min(x),
range(y),
count(x),
count(*)
from Oops;
SPD Server Index Scan
is an extension to the SPD Server Parallel Group-By facility. The
query must first be accepted by Parallel Group-By to be evaluated
for an Index Scan. The section on
Parallel Group-By Facility contains more detailed information. When SPD Server uses
the Index Scan optimization, the following message is printed to the
SAS log:
SPDS_NOTE: Metascan used to resolve this query.