SPD Server SQL gives
you the ability to use quick index scans on large tables. Rather than
scan entire tables sequentially, which can have million or billions
of rows, SPD Server SQL can scan cached index metadata. SPD Server
SQL provides enhanced index scan support for the following functions:
min, max, count, nmiss, range uss, css, std, stderr, and var. All
of the functions can accept the DISTINCT term.
All the index scan capabilities
are available for both standard SPD Server tables and clustered tables,
with the exception of the DISTINCT qualifier. The DISTINCT index scan
function is not available in clustered tables.
The count(*) function
is the only function that is included with the index scan support
that does not require an index on the table. For example, consider
the following code:
select count(*) from tablename;
This code returns the
number of rows in the large table
tablename
without
performing a row scan of the table. Table metadata returns the correct
number of rows. As a result, the response is as fast as an index scan,
even on an unindexed table.
count(*) functions with
WHERE clauses require an index for each column referenced in the WHERE
clause in order for the index scan feature to improve performance.
For example, suppose SPD Server table Foo has indexes on numeric columns
a and b. The following count(*) functions benefit from SPD Server
index scan support:
select count(*)
from Foo
where a = 1;
select count(*)
from Foo
where a LT 4
and b EQ 5;
select count(*)
from Foo
where a in (2,4,5)
or b in (10,20,30);
All functions other
than count(*) require an index on function columns in order to exploit
the index scan performance savings. Minimal WHERE clause support is
available for these queries, as long as all functions use the same
column, and the WHERE clause is a simple clause that uses the LT,
LE, EQ, GE, GT, IN, or BETWEEN operator for that column. For example,
suppose that the SPD Server table Bar has indexes on numeric columns
x and y. The following SQL submissions exploit the performance gains
of index scans:
select min(x),
max(x),
count(x),
nmiss(x),
range(x),
count(distinct x)
from Bar;
select min(x),
max(x),
count(x),
nmiss(x),
range(x),
count(distinct x)
from Bar
where x between 5 and 10;
select min(x),
max(x),
count(x),
nmiss(x),
range(x),
count(distinct x)
from Bar
where x gt 100;
select min(x),
min(y),
count(x),
count(y)
from Bar;
If any one function
in a statement does not meet the index scan criteria, all functions
in that statement revert to being resolved by table scan instead of
index scan. Suppose the user-named SPD Server table Oops has indexes
on numeric columns x and y. Column z is not indexed. The following
SPD Server SQL statement is entirely evaluated by table scan; index
scanning is not performed on any of the functions.
select min(x),
min(y),
count(x),
count(y),
count(z)
from Oops;
To take advantage of
index scans, you could resubmit the previous statement in the following
way:
select min(x),
min(y),
count(x),
count(y)
from Oops;
select count(y)
from Bar;
The functions min(x),
min(y), count(x), and count(y) are evaluated using index scan metadata
and exploit the performance gains. The function count(y) continues
to be evaluated by table scan. You can combine the count(*) function
with other functions and benefit from index scan performance gains.
For the SPD Server table Oops with indexes on numeric columns x and
y, the following SPD Server SQL statement benefits from index scan
performance:
select min(x),
range(y),
count(x),
count(*)
from Oops;
SPD Server Index Scan
is an extension of the SPD Server Parallel Group-By facility. The
query must first be accepted by the Parallel Group-By facility to
be evaluated for an Index scan. For more information, see
Parallel Group-By Facility . When SPD Server uses the Index Scan optimization, the
following message is printed to the SAS log:
SPDS_NOTE: Metascan used to resolve this query.