Understanding SPD Server Indexing

Overview of Indexing

A significant strength of SPD Server is efficient creation, maintenance, and use of table indexes. Indexing can greatly speed the evaluation of WHERE clause queries. Indexes can be a source of sort order when performing BY clause processing. Indexes are also used directly by some SAS applications. For example, PROC SQL uses indexes to efficiently evaluate equijoins.

The server supports indexes for queries that require global table views (such as queries that contain BY clause processing or SQL joins) and segmented views (such as parallel processing of WHERE clause statements).

The server can thread WHERE clause evaluations for tables that are not indexed. However, indexes enable rapid WHERE clause evaluations. You should index large tables to optimize server performance. For information about indexing with WHERE, see Server Indexing with WHERE Clause.

Index creation is a CPU-intensive process. When sufficient processing power is available, parallel index creation in the server is highly scalable. The creation process for each index is threaded. A single index creation can use multiple CPUs on a server if they are available, which greatly improves performance. The server efficiently indexes tables of varying size and data distributions.

There are several ways to define indexes on table columns.

Parallel Index Creation

SPD Server supports parallel index creation using asynchronous index options. To enable asynchronous parallel index creation, either submit the SPDSIASY=YES macro variable before creating an index in SAS, or use the ASYNCINDEX=YES table option.

Both the macro variable and the table option apply to the DATA step INDEX= processing as well as to PROC DATASETS INDEX CREATE statements. Either method allows all of the declared indexes to be populated with a single scan of the table. A single scan is a substantial improvement over making multiple passes through the data to build each index serially.

Note: To create multiple indexes requires enough WORKPATH= disk space to create all of the key sorts at the same time. Consult with your server administrator to find out if you have enough disk space.

PROC DATASETS has the flexibility to allow batched parallel index creation by using multiple MODIFY groups. Parallel Index Creation inserts INDEX CREATE statements between two successive MODIFY statements resulting in a parallel creation group.

Parallel Index Updates

SPD Server also supports parallel index updates during table append operations. Multiple threads enable overlap of data transfer to the proxy, as well as updates of the data store and index files. The server decomposes table append operations into a set of steps that can be performed in parallel. The level of parallelism attained depends on the number of indexes that are present on the table. The more indexes you have, the greater the exploitation of parallelism during the append processing. As with parallel index creation, parallel index updates use WORKPATH= disk space for the key sorts that are part of the index append processing.

Last updated: February 8, 2017