DataFlux Data Management Studio 2.6: User Guide

Cluster Analysis Node

You can add a Cluster Analysis node to a data job to compare pairs of rows within a single match cluster to determine whether each pair is really a match. Once you have added the node, you can double-click it to open its properties dialog. The properties dialog includes the following elements:

Name - Specifies a name for the node.

Notes - Opens the Notes dialog, where you can enter optional details or any other relevant information about the node.

Options - Displays the Options dialog, where you can set the following options:

The Record rules section of the dialog enables you to create rules using expressions for the Cluster Analysis node. These rules are applied to pairs of rows. Each rule should return a positive score. The section includes the following elements:

Passing Score - In combination with the Method for passing setting, determines whether a row pair is considered a match (pass) or not (fail). A row appears in the output only if it passes.

Method for passing - Defines how the passing score will be interpreted and how individual rule scores will be used when determining if rows in a pair are considered a match (pass) or not (fail). Select one of the following options from the drop-down list:

Record rules table - Displays the current record rules.

Add - Displays the Record Rule dialog. You can use the dialog to create new rules, which you must name and designate as either Basic or Advanced. Use the drop-down menus in the columns to perform the following functions:

You can use the Add and Delete buttons to maintain the list of record rules.

Edit - Enables you to modify the record rules.
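The way record rules, the passing score, and the method for passing fit together can be sketched outside the product. The Python example below is an illustration only, not the expression language that the node uses. The rule functions, the field names (surname, postcode), the score values, and the two methods ("sum" and "max") are all assumptions made for the example. The options that the Method for passing drop-down list actually offers might differ.

```python
from typing import Callable, Dict, List

Row = Dict[str, str]
Rule = Callable[[Row, Row], float]


def same_surname(a: Row, b: Row) -> float:
    """Hypothetical rule: 60 points when surnames agree, ignoring case."""
    return 60.0 if a["surname"].lower() == b["surname"].lower() else 0.0


def same_postcode(a: Row, b: Row) -> float:
    """Hypothetical rule: 40 points when postcodes agree exactly."""
    return 40.0 if a["postcode"] == b["postcode"] else 0.0


def pair_passes(a: Row, b: Row, rules: List[Rule],
                passing_score: float, method: str = "sum") -> bool:
    """Score one row pair with every rule, then compare to the passing score.

    The "sum" and "max" methods are assumptions for this sketch.
    """
    scores = [rule(a, b) for rule in rules]
    if method == "sum":
        combined = sum(scores)
    elif method == "max":
        combined = max(scores, default=0.0)
    else:
        raise ValueError(f"unknown method: {method}")
    return combined >= passing_score


rules = [same_surname, same_postcode]
row_a = {"surname": "Smith", "postcode": "27513"}
row_b = {"surname": "SMITH", "postcode": "27513"}
row_c = {"surname": "Smyth", "postcode": "27513"}

print(pair_passes(row_a, row_b, rules, passing_score=80))  # both rules score
print(pair_passes(row_a, row_c, rules, passing_score=80))  # only postcode scores
```

In this sketch, the same pair of rows can pass under one method and fail under another, which is why the passing score is always read together with the method for passing.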

The Output fields section of the dialog includes the following elements:

Available - Displays the fields that you can make available for the next step in your data job. Items displayed in this list are dependent on your data sources and any preceding steps in your data job.

Selected - Displays the fields that will be made available to the next node in your data job.

You can access the following advanced properties by right-clicking the Cluster Analysis node:


Doc ID: dfU_PFInt_ClustAnalysis.html