Sub-Clustering Node

DataFlux Data Management Studio 2.5: User Guide

You can add a Sub-Clustering node to a data job. Once you have added the node, you can double-click it to open its properties dialog. The properties dialog includes the following elements:

Name - Specifies a name for the node.

Notes - Opens the Notes dialog, where you can enter optional details or any other relevant information about the node.

Cluster ID field - Enables you to select the input field that contains the numeric cluster identifier.

Primary Key 1 field - Enables you to select the first primary key field.

Primary Key 2 field - Enables you to select the second primary key field.

Sort output by cluster number - When selected, sorts the output rows by cluster number.

Compact cluster numbers - When selected, removes gaps in the sequence of cluster numbers.

Memory to use for processing each cluster - Specifies the amount of memory to use per cluster. The memory configured for this node is split between clustering and storage libraries. The default is 16 MB. If each cluster has a small number of rows, this value should be kept low. If you have many clusters with thousands of rows each, you should increase the memory setting to improve performance.
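
The effect of the Sort output by cluster number and Compact cluster numbers options can be pictured with a small Python sketch. This is an illustration only, not the node's actual implementation. It assumes that "compacting" means renumbering clusters consecutively so that no numbers are skipped, and that numbering starts at 0; the function names, the "cluster" field name, and the sample rows are hypothetical.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def compact_cluster_numbers(rows: List[Row], field: str = "cluster") -> List[Row]:
    """Renumber clusters consecutively, preserving their relative order.

    Assumption: numbering starts at 0; the real node may start elsewhere.
    """
    distinct = sorted({row[field] for row in rows})
    mapping = {old: new for new, old in enumerate(distinct)}
    # Return copies so the input rows are left unchanged.
    return [{**row, field: mapping[row[field]]} for row in rows]


def sort_by_cluster_number(rows: List[Row], field: str = "cluster") -> List[Row]:
    """Order rows by cluster number; rows within a cluster keep their input order."""
    return sorted(rows, key=lambda row: row[field])


# Hypothetical input: cluster numbers 2, 7, and 10 leave gaps in the sequence.
rows = [
    {"id": "a", "cluster": 7},
    {"id": "b", "cluster": 2},
    {"id": "c", "cluster": 7},
    {"id": "d", "cluster": 10},
]

result = sort_by_cluster_number(compact_cluster_numbers(rows))
for row in result:
    print(row["cluster"], row["id"])
```

In this sketch the three distinct cluster numbers are mapped to 0, 1, and 2, and the rows are then grouped by the new number. Rows that shared a cluster number in the input still share one in the output.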

The Output fields section of the dialog includes the following elements:

Available - Displays the fields that you can make available to the next node in your data job. The fields listed depend on your data sources and on any preceding nodes in the data job.

Selected - Displays the fields that will be made available to the next node in your data job.

Cluster number output field - Specifies the field name for the cluster number.

You can access the following advanced properties by right-clicking the Sub-Clustering node:

Doc ID: dfU_PFInt_SubClust.html