Previous Page | Next Page

SAS/ACCESS Interface to Aster nCluster

Autopartitioning Scheme for Aster nCluster


Overview

Autopartitioning for SAS/ACCESS Interface to Aster nCluster is a modulo (MOD) function method. For general information about this feature, see Autopartitioning Techniques in SAS/ACCESS.


Autopartitioning Restrictions

SAS/ACCESS Interface to Aster nCluster places additional restrictions on the columns that you can use for the partitioning column during the autopartitioning phase. Here is how columns are partitioned.


Nullable Columns

If you select a nullable column for autopartitioning, the OR<column-name>IS NULL SQL statement is appended at the end of the SQL code that is generated for the threaded read. This ensures that any possible NULL values are returned in the result set. Also, if the column to be used for the partitioning is SQL_BIT, the number of threads are automatically changed to two, regardless of DBSLICEPARM= option setting.


Using WHERE Clauses

Autopartitioning does not select a column to be the partitioning column if it appears in a WHERE clause. For example, this DATA step could not use a threaded read to retrieve the data. All numeric columns in the table are in the WHERE clause:

DATA work.locemp;
    SET trlib.MYEMPS;
    WHERE EMPNUM<=30 and ISTENURE=0 and
       SALARY<=35000 and NUMCLASS>2;
run;


Using DBSLICEPARM=

SAS/ACCESS Interface to Aster nCluster defaults to three threads when you use autopartitioning but do not specify a maximum number of threads in to use for the threaded read. See DBSLICEPARM= LIBNAME Option.


Using DBSLICE=

You might achieve the best possible performance when using threaded reads by specifying the DBSLICE= Data Set Option for Aster nCluster in your SAS operation. Using DBSLICE= allows connections to individual partitions so that you can configure an Aster nCluster data source for each partition. Use this option to specify both the data source and the WHERE clause for each partition.

proc print data=trilb.MYEMPS(DBSLICE=(DSN1='EMPNUM BETWEEN 1 AND 33'
DSN2='EMPNUM BETWEEN 34 AND 66'
DSN3='EMPNUM BETWEEN 67 AND 100'));
run;

Using the DATASOURCE= option is not required to use DBSLICE= option with threaded reads.

Using DBSLICE= works well when the table you want to read is not stored in multiple partitions. It gives you flexibility in column selection. For example, if you know that the STATE column in your employee table contains only a few distinct values, you can tailor your DBSLICE= option accordingly.

data work.locemp;
   set trlib2.MYEMP(DBSLICE=("STATE='FL'" "STATE='GA'" 
     "STATE='SC'" "STATE='VA'" "STATE='NC'"));
where EMPNUM<=30 and ISTENURE=0 and SALARY<=35000 and NUMCLASS>2;
run;

Previous Page | Next Page | Top of Page