Autopartitioning Scheme for Aster nCluster :: SAS/ACCESS(R) 9.3 for Relational Databases: Reference, Second Edition

Overview

Autopartitioning for SAS/ACCESS Interface to Aster nCluster is a modulo (MOD) function method. For general information about this feature, see Autopartitioning Techniques in SAS/ACCESS.

Autopartitioning Restrictions

SAS/ACCESS Interface to Aster nCluster places additional restrictions on the columns that you can use for the partitioning column during the autopartitioning phase. Here is how columns are partitioned.

SQL_INTEGER, SQL_BIT, SQL_SMALLINT, and SQL_TINYINT columns are given preference.
You can use SQL_DECIMAL, SQL_DOUBLE, SQL_FLOAT, SQL_NUMERIC, and SQL_REAL columns for partitioning under these conditions:
- Aster nCluster supports converting these types to SQL_INTEGER by using the INTEGER cast function.
- The precision minus the scale of the column is greater than 0 but less than 10—namely, 0<(precision-scale)<10.

Nullable Columns

If you select a nullable column for autopartitioning, the

OR<column-name>IS
                     NULL

SQL statement is appended at the end of the SQL code that is generated for the threaded Read. This ensures that any possible NULL values are returned in the result set. Also, if the column to be used for the partitioning is SQL_BIT, the number of threads are automatically changed to two, regardless of DBSLICEPARM= option setting.

Using WHERE Clauses

Autopartitioning does not select a column to be the partitioning column if it appears in a WHERE clause. For example, this DATA step could not use a threaded Read to retrieve the data. All numeric columns in the table are in the WHERE clause:

DATA work.locemp;
    SET trlib.MYEMPS;
    WHERE EMPNUM<=30 and ISTENURE=0 and
       SALARY<=35000 and NUMCLASS>2;
run;

Using DBSLICEPARM=

SAS/ACCESS Interface to Aster nCluster defaults to three threads when you use autopartitioning but do not specify a maximum number of threads in to use for the threaded Read. See DBSLICEPARM= LIBNAME option.

Using DBSLICE=

You might achieve the best possible performance when using threaded Reads by specifying the DBSLICE= data set option for Aster nCluster in your SAS operation. Using DBSLICE= allows connections to individual partitions so that you can configure an Aster nCluster data source for each partition. Use this option to specify both the data source and the WHERE clause for each partition.

proc print data=trilb.MYEMPS(DBSLICE=(DSN1='EMPNUM BETWEEN 1 AND 33'
DSN2='EMPNUM BETWEEN 34 AND 66'
DSN3='EMPNUM BETWEEN 67 AND 100'));
run;

Using the DATASOURCE= option is not required to use DBSLICE= option with threaded Reads.

Using DBSLICE= works well when the table that you want to read is not stored in multiple partitions. It gives you flexibility in column selection. For example, if you know that the STATE column in your employee table contains only a few distinct values, you can customize your DBSLICE= option accordingly.

data work.locemp;
   set trlib2.MYEMP(DBSLICE=("STATE='FL'" "STATE='GA'"
     "STATE='SC'" "STATE='VA'" "STATE='NC'"));
where EMPNUM<=30 and ISTENURE=0 and SALARY<=35000 and NUMCLASS>2;
run;