
Data Set Options for Relational Databases

DISTRIBUTED_BY= Data Set Option



Uses one or multiple columns to distribute table rows across database segments.
Default value: RANDOMLY DISTRIBUTED
Valid in: DATA and PROC steps (when accessing DBMS data using SAS/ACCESS software)
DBMS support: Greenplum

Syntax

DISTRIBUTED_BY='column-1 <... ,column-n>' | RANDOMLY DISTRIBUTED

Syntax Description

column-1 <... ,column-n>

specifies one or more DBMS column names, separated by commas, that form the distribution key.

DISTRIBUTED RANDOMLY

specifies that the Greenplum database distributes table rows across database segments without using a distribution key. This is known as round-robin distribution.


Details

For uniform distribution, that is, so that table records are stored evenly across the segments (machines) that are part of the database configuration, the distribution key should be as unique as possible. A key with few distinct values concentrates rows on a small number of segments.
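
One rough way to judge whether a column is unique enough to serve as a distribution key is to compare its number of distinct values with the total row count. The following is a minimal sketch, assuming that the libref x and the SALES table from the example in this section already exist; the output column names n_rows and n_distinct_id are chosen only for illustration.

```sas
/* Assumes libref x and table SALES exist (see the example in this section). */
proc sql;
   select count(*)           as n_rows,
          count(distinct id) as n_distinct_id
      from x.sales;
quit;
```

If n_distinct_id is close to n_rows, the column is likely to spread rows evenly across segments; a much smaller value suggests that the rows would be skewed toward a few segments.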


Example

This example shows how to create a table by specifying a distribution key.

libname x sasiogpl user=myuser password=mypwd dsn=Greenplum;

data x.sales (dbtype=(id=int qty=int amt=int) distributed_by='distributed by (id)');
   id = 1;
   qty = 100;
   sales_date = '27Aug2009'd;
   amt = 20000;
run;

The DATA step creates the SALES table in Greenplum by using the following statement.

CREATE TABLE SALES 
(id int,
 qty int,
 sales_date double precision,
 amt int
) distributed by (id)
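
Because the syntax accepts more than one column, a distribution key can also combine columns. The following sketch assumes the same libref x and passes the full clause in the same form as the example above; the table name SALES2 and the choice of id and qty as key columns are hypothetical and are not taken from the original documentation.

```sas
/* Hypothetical variation: two-column distribution key (id, qty). */
data x.sales2 (dbtype=(id=int qty=int amt=int) distributed_by='distributed by (id, qty)');
   id = 1;
   qty = 100;
   amt = 20000;
run;
```

Under these assumptions, the generated CREATE TABLE statement would be expected to end with distributed by (id, qty).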
