Syntax for the SPD Engine
The SPD Engine is a new Version 9 libname engine and therefore follows the standard libname syntax. There are several new
options that can be used to tune performance when using the SPD Engine. The following sections list the new as well as
existing libname, data set, and system options that can be used to tune performance of the SPD Engine.
LIBNAME libref SPDE 'primary-path' <SPDE options>;
where SPDE options include:
-
bysort - instructs the SPD Engine to perform an automatic implicit sort when it encounters a BY statement for
processing data in the library unless the data is indexed on the BY variable. Default is YES.
-
datapath - specifies the directories for the partition data files (.DPF) for an SPD Engine data set.
-
endobs - specifies the end observation number in a user-defined range of observations that are qualified with a
WHERE expression.
-
indexpath - specifies the directories for the index files (.HBX and .IDX) associated with an SPD Engine data set.
-
metapath - specifies the overflow locations for the metadata files (.MDF) for an SPD Engine data set. The
METAPATH= option is specified for space that is exclusively overflow space for the metadata component file. The metadata
component file for each data set must begin in the primary path, and the overflow occurs to the METAPATH= location when
that file is full.
-
partsize - specifies the size (in megabytes) of each data file partition (.DPF). This specification applies only
to partitions in the data component files. Default is 128.
-
startobs - specifies the starting observation number in a user-defined range of observations that are qualified
with a WHERE expression.
-
temp - specifies whether to create a temporary subdirectory of the directory that is specified in the LIBNAME
statement, and to delete it at the end of the SAS session. Default is NO.
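As a minimal sketch of how these LIBNAME options fit together, the statement below assigns an SPD Engine library whose metadata sits in the primary path while the data partitions and index files are spread over separate directories. The libref SALES and every directory name are hypothetical; substitute paths that exist on your system.

```sas
/* Hypothetical libref and directories, shown for illustration only. */
libname sales spde '/spde/meta'                          /* primary path: metadata (.MDF) starts here */
    datapath=('/disk1/spde/data' '/disk2/spde/data')    /* directories for data partitions (.DPF)    */
    indexpath=('/disk3/spde/index')                      /* directory for index files (.HBX, .IDX)    */
    partsize=256                                         /* partition size in megabytes               */
    bysort=yes;                                          /* keep the implicit sort for BY processing  */
```

Placing the DATAPATH= and INDEXPATH= directories on different physical devices is the usual reason for listing more than one location, since it lets partitions and indexes be read in parallel.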
The following data set options can also be used to tune performance of the SPD Engine.
-
asyncindex - instructs the SPD Engine to create multiple indexes in parallel. Default is NO.
-
bynoequals - specifies whether the output order of data set observations with identical values for the BY variable
is guaranteed to be in data set order. Default is NO.
-
bysort - instructs the SPD Engine to perform an automatic implicit sort when it encounters a BY statement for
processing a data set, unless the data is indexed on the BY variable, or is already sorted. Default is YES.
-
compress - controls whether the SPD Engine compresses the data file. Compressing a data set usually reduces its
size (and therefore I/O) at the expense of some added CPU cost. Default is NO.
-
endobs - specifies the end observation number in a user-defined range of observations to be processed.
-
idxwhere - instructs the SPD Engine to use indexes when processing a WHERE expression. Default is YES.
-
ioblocksize - specifies the number of observations in a block to be stored in or read from an SPD Engine component
file that is compressed. Default is 4096.
-
padcompress - specifies a number of bytes to add to compression blocks in a data set opened for UPDATE. Default is
0.
-
partsize - specifies the fixed size (in megabytes) of the data component partitions when an SPD Engine data set is
created. This specification applies only to partitions in the data component files. Default is 128.
-
segsize - specifies the number of observations represented by an index component file segment. Default is 8192.
-
startobs - specifies the starting observation number in a user-defined range of observations to be processed.
-
syncadd - specifies whether to process one observation at a time or multiple observations at a time. Default is NO.
-
threadnum - specifies the number of threads to be used for processing an SPD Engine data set. Default is the value
of SPDEMAXTHREADS if set; otherwise default is 2 times the number of CPUs on your machine.
-
uniquesave - specifies whether to save observations with non-unique key values (the rejected observations) to a
separate data set when appending or inserting observations to data sets with unique indexes. Default is NO.
-
wherenoindex - specifies a list of indexes to be excluded from WHERE processing.
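The following sketch shows where these data set options are typically written: in parentheses after the data set name. It assumes a libref SALES already assigned with the SPD Engine and a source table WORK.ORDERS_RAW; both names, and the variables customer_id, order_date, and amount, are hypothetical.

```sas
/* Create a compressed SPD Engine data set with 64 MB partitions. */
data sales.orders (compress=yes partsize=64);
    set work.orders_raw;            /* hypothetical input table */
run;

/* Request parallel creation of the two indexes. */
proc datasets lib=sales nolist;
    modify orders (asyncindex=yes);
    index create customer_id;
    index create order_date;
run;
quit;

/* Limit this step to 4 threads and bypass indexes for the WHERE clause. */
proc means data=sales.orders (threadnum=4 idxwhere=no);
    where order_date >= '01JAN2004'd;
    var amount;
run;
```

Because the options are attached to individual data set references, each step can be tuned independently without changing the LIBNAME statement.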
The following SAS system options can be used to tune performance of the SPD Engine. These options can be specified in the
configuration file, SAS invocation, OPTIONS statement, or in the System Options window unless otherwise noted.
-
compress - specifies whether to compress the SPD Engine data sets on disk as they are being created. Default is NO.
-
maxsegratio - controls whether segment candidate pre-evaluation is performed when a WHERE expression is evaluated
for processing with indexes. Default is 75.
-
minpartsize - specifies a minimum partition size to use for creating SPD Engine data sets. Valid in config file and
SAS invocation only. Default is 0.
-
spdeindexsortsize - specifies the maximum amount of memory used by the sort when creating an index. When indexes
are created in parallel (because ASYNCINDEX=YES), the SPDEINDEXSORTSIZE value is divided up among all the concurrent
index creation threads. Default is 32M.
-
spdemaxthreads - sets the upper limit on the number of threads the SPD Engine is allowed to use. In a computer
shared by multiple users, it may be important to limit the number of threads used by the SPD Engine to avoid using too
many CPUs. SPDEMAXTHREADS= constrains the THREADNUM= data set option. Valid in config file and SAS invocation only.
Default is 0.
-
spdesortsize - specifies the maximum amount of memory used by the sort. Because multiple sorts may execute in
parallel, the real amount of memory used by sorts is SPDESORTSIZE multiplied by the number of concurrent sorts.
Default is 32M.
-
spdeutilloc - specifies one or more file system locations in which the SPD Engine can temporarily store utility
files.
-
spdewheval - specifies the process used to determine which observations meet the condition(s) of a WHERE
expression. Default is COST.
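As an illustration, the system options that are not restricted to the configuration file or SAS invocation can be set within a session with an OPTIONS statement. The values below are arbitrary examples, not recommendations. SPDEMAXTHREADS= and MINPARTSIZE= are left out because the list above marks them as valid at startup only.

```sas
/* Example values only; tune them to the memory available on your machine. */
options spdesortsize=64M         /* memory limit for each sort                       */
        spdeindexsortsize=64M    /* memory shared by concurrent index-creation sorts */
        maxsegratio=50
        spdewheval=cost;

/* Write the current settings to the log to confirm them. */
%put SPDESORTSIZE=%sysfunc(getoption(spdesortsize));
%put SPDEINDEXSORTSIZE=%sysfunc(getoption(spdeindexsortsize));
%put SPDEMAXTHREADS=%sysfunc(getoption(spdemaxthreads));
```

With these example settings, four sorts running concurrently could use up to 4 x 64M = 256M of memory, which is worth checking against the memory available to the SAS session.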