Use the SPDSSIZE= macro variable to specify the size of an SPD Server table partition.
Syntax
SPDSSIZE=n
Default: 16 MB for domains that are not Hadoop domains, 128 MB for Hadoop domains
Corresponding Table Option: PARTSIZE=
Affected by LIBNAME Option: DATAPATH=
Arguments
n
is the size of the partition in megabytes.
Description
Use the SPDSSIZE= macro variable to improve the performance of WHERE
clause evaluation on non-indexed table columns.
Splitting the data portion of a server table at fixed-size intervals
allows SPD Server to scale non-indexed WHERE clause evaluation. SPD Server
launches threads in parallel, and each thread can evaluate a different
partition of the table without file access or thread contention.
The speed enhancement comes at the cost of disk usage. The more
partitions you create, the more files are required to store the rows
of the table.
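The trade-off between partition size and file count can be estimated with simple arithmetic. The sketch below is illustrative only: the function name and the 10 GB table size are hypothetical, and the estimate assumes one data file per partition, ignoring metadata and index files.

```python
import math

def estimate_partition_count(table_size_mb: float, partition_size_mb: int) -> int:
    """Rough number of data partition files for a table of the given size.

    Assumes one file per fixed-size partition (an illustrative simplification).
    """
    if partition_size_mb <= 0:
        raise ValueError("partition size must be positive")
    return math.ceil(table_size_mb / partition_size_mb)

# Hypothetical 10 GB (10240 MB) table at three candidate SPDSSIZE= values.
for size_mb in (16, 128, 1024):
    print(size_mb, estimate_partition_count(10240, size_mb))
```

Smaller partitions give more units of work for parallel threads, but the file count grows in inverse proportion to the partition size.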
Scalability limits on
the SPDSSIZE= macro variable ultimately depend on how you structure
the DATAPATH= option in your LIBNAME statement. The configuration
of the DATAPATH= file systems across striped volumes is important.
You should spread each individual volume's striping configuration
across multiple disk controllers and SCSI channels in the disk storage
array. Your configuration goal, at the hardware level, should be to
maximize parallelism when performing data retrieval.
The SPDSSIZE= specification is also limited by MINPARTSIZE=, an SPD
Server parameter maintained by the SPD Server administrator.
MINPARTSIZE= ensures that an overzealous SAS user cannot create
arbitrarily small partitions, thereby generating an excessive number
of physical files. The default for MINPARTSIZE= is 16 MB for domains
that are not Hadoop domains, and 128 MB for Hadoop domains.
If you use SPDSSIZE= to specify a partition size for a non-Hadoop
domain, the value of SPDSSIZE= must be greater than the value declared
for MINPARTSIZE= in the SPD Server parameter file in order to have any
effect.
If you use SPDSSIZE= to specify a partition size for a Hadoop domain,
the value of SPDSSIZE= must be greater than the larger of the value
declared for MINPARTSIZE= or 128 MB in order to have any effect.
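The two rules above can be summarized as a small decision function. This is a sketch, not SPD Server's actual logic: the function and constant names are hypothetical, and it assumes that a request at or below the minimum is ignored and the minimum is used instead, which the text above implies but does not state outright.

```python
HADOOP_FLOOR_MB = 128  # minimum for Hadoop domains, as stated above

def effective_partition_size(spdssize_mb: int, minpartsize_mb: int,
                             hadoop_domain: bool = False) -> int:
    """Return the partition size (MB) that would take effect.

    Assumption: when SPDSSIZE= is not greater than the applicable
    minimum, the minimum applies instead of the requested value.
    """
    if hadoop_domain:
        floor_mb = max(minpartsize_mb, HADOOP_FLOOR_MB)
    else:
        floor_mb = minpartsize_mb
    return spdssize_mb if spdssize_mb > floor_mb else floor_mb

# Non-Hadoop domain with MINPARTSIZE=16: a request of 8 MB has no effect.
print(effective_partition_size(8, 16))
# Hadoop domain with MINPARTSIZE=16: a request of 64 MB has no effect.
print(effective_partition_size(64, 16, hadoop_domain=True))
```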
Note: The SPDSSIZE= value for a table cannot be changed after the table
is created. To change the partition size, use PROC COPY to copy the
table and specify a different SPDSSIZE= (or PARTSIZE=) setting for the
new (output) table.
For an example using the table option, see PARTSIZE=.