DISTRIBUTE_ON= Data Set Option

Specifies one or more column names to use in the DISTRIBUTE ON clause of the CREATE TABLE statement.
Valid in: DATA and PROC steps (when accessing DBMS data using SAS/ACCESS software)
Alias: DISTRIBUTE= [Netezza]
Default: none
Data source: Aster nCluster, Netezza

Syntax

DISTRIBUTE_ON='column-1 <…,column-n>' | RANDOM

Syntax Description

column-1 <…,column-n>
specifies one or more DBMS column names, separated by commas.
RANDOM
specifies that data is distributed evenly. For Netezza, the Netezza Performance Server distributes the rows across all snippet processing units (SPUs). This is known as round-robin distribution.

Details

You can use this option to specify one or more column names to use in the DISTRIBUTE ON clause of the CREATE TABLE statement. Each table in the database must have a distribution key that consists of one to four columns. If you do not specify this option, the DBMS selects a distribution key.
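
To make the effect of the option concrete, the following sketch issues a comparable CREATE TABLE statement directly through explicit SQL pass-through to Netezza. It is an illustration only, not the exact SQL that SAS generates: the connection values (mysrv, mydb, myuser, mypwd) and the column types are assumptions chosen for the example.

```sas
/* Sketch only: connection values and column types are placeholders. */
proc sql;
   connect to netezza (server=mysrv database=mydb user=myuser password=mypwd);

   /* The DISTRIBUTE ON clause names the distribution key column. */
   execute (
      create table customtab (
         partno   integer,
         customer varchar(40),
         orderdat date
      )
      distribute on (partno)
   ) by netezza;

   disconnect from netezza;
quit;
```

With the RANDOM keyword, the corresponding Netezza clause would be DISTRIBUTE ON RANDOM instead of a column list.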

Examples

Example 1: Create a Distribution Key on a Single Column

proc sql;
create table netlib.customtab(DISTRIBUTE_ON='partno')
   as select partno, customer, orderdat from saslib.orders;
quit;

Example 2: Create a Distribution Key on Multiple Columns

For more than one column, separate the columns with commas.

data netlib.mytab(DISTRIBUTE_ON='col1,col2');
   col1=1; col2=12345; col4='mytest'; col5=98.45;
run;

Example 3: Use the RANDOM Keyword

data netlib.foo(DISTRIBUTE_ON=RANDOM);
   mycol1=1; mycol2='test';
run;
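
For Netezza, the DISTRIBUTE= alias listed above should be usable in the same position as DISTRIBUTE_ON=. The following is a minimal sketch, assuming that netlib is a libref assigned to a Netezza database; the table name mytab2 is hypothetical.

```sas
/* Assumes netlib points to a Netezza database; mytab2 is a made-up table name. */
data netlib.mytab2(DISTRIBUTE='col1');
   col1=1; col2='test';
run;
```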