Data Set Options for Relational Databases

DISTRIBUTE_ON= Data Set Option

Specifies a column name to use in the DISTRIBUTE ON clause of the CREATE TABLE statement.

Alias:	DISTRIBUTE= [Netezza]
Default value:	none
Valid in:	DATA and PROC steps (when accessing DBMS data using SAS/ACCESS software)
DBMS support:	Aster nCluster, Netezza

Syntax
Syntax Description
Details
Examples

Syntax

DISTRIBUTE_ON='column-1 <... ,column-n>' | RANDOM

Syntax Description

column-name: specifies a DBMS column name.
RANDOM: specifies that data is distributed evenly. For Netezza, the Netezza Performance Server does this across all SPUs. This is known as round-robin distribution.

Details

You can use this option to specify a column name to use in the DISTRIBUTE ON= clause of the CREATE TABLE statement. Each table in the database must have a distribution key that consists of one to four columns. If you do not specify this option, the DBMS selects a distribution key.

Examples

This example uses DISTRIBUTE_ON= to create a distribution key on a single column.

proc sql;
create table netlib.customtab(DISTRIBUTE_ON='partno') 
   as select partno, customer, orderdat from saslib.orders;
quit;

To create a distribution key on more than one column, separate the columns with commas.

data netlib.mytab(DISTRIBUTE_ON='col1,col2');
col1=1;col2=12345;col4='mytest';col5=98.45;
run;

This next example shows how to use the RANDOM keyword.

data netlib.foo(distribute_on=random);
mycol1=1;mycol2='test';
run;

Top of Page