DataFlux Data Management Studio 2.5: User Guide
You can add a Surviving Record Identification node to a data job to examine clustered data and determine a surviving record for each cluster. This surviving record identification (SRI) process lets you eliminate duplicate information in a data source. The surviving record is identified using one or more user-configurable record rules. Once you have added the node, you can double-click it to open its properties dialog. The properties dialog includes the following elements:
Name - Specifies a name for the node.
Notes - Enables you to open the Notes dialog. You use the dialog to enter optional details or any other relevant information for the input.
Cluster ID Field - Specifies the input field that contains the cluster identifier for the incoming data.
Options - Displays the Options dialog, where you can set the following options:
The Record rules section of the dialog contains rules that are used to determine which record in the cluster should be chosen as the surviving record. The section includes the following elements:
Record rules table - Displays the current record rules.
Add - Displays the Add Record Rule Expression dialog, which enables you to add record rules.
Edit - Enables you to modify a record rule.
Stop processing after first rule yields records - When selected, stops rule processing as soon as the first rule produces surviving record results. Any sub-rules that accompany the first rule are still processed.
Note: When the surviving record row is part of a rule's result set, rules take values from that row rather than from the first matching row.
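The way ordered record rules narrow a cluster down to one surviving record can be sketched in a few lines of Python. This is an illustrative model only, not the product's implementation. The function names, the rule helpers (highest, longest), the sample fields, and the tie-breaking behavior (take the first remaining row) are all assumptions made for the example.

```python
from collections import defaultdict


def group_by_cluster(rows, cluster_field):
    """Group row dicts by the value of the cluster ID field."""
    clusters = defaultdict(list)
    for row in rows:
        clusters[row[cluster_field]].append(row)
    return clusters


def pick_survivor(cluster_rows, rules, stop_after_first=False):
    """Apply rules in order; each rule maps a list of rows to the rows it keeps.

    Assumptions for this sketch: a rule that keeps no rows is skipped, and
    any remaining tie is broken by taking the first remaining row.
    """
    candidates = list(cluster_rows)
    for index, rule in enumerate(rules):
        kept = rule(candidates)
        if kept:
            candidates = kept
            # Models "Stop processing after first rule yields records".
            if stop_after_first and index == 0:
                break
    return candidates[0]


def highest(field):
    """Hypothetical rule: keep rows holding the largest value in `field`."""
    def rule(rows):
        top = max(r[field] for r in rows)
        return [r for r in rows if r[field] == top]
    return rule


def longest(field):
    """Hypothetical rule: keep rows holding the longest string in `field`."""
    def rule(rows):
        top = max(len(r[field]) for r in rows)
        return [r for r in rows if len(r[field]) == top]
    return rule


rows = [
    {"cluster": 1, "name": "Jon Smith", "updated": 2019},
    {"cluster": 1, "name": "J. Smith", "updated": 2021},
    {"cluster": 1, "name": "Jonathan Smith", "updated": 2021},
    {"cluster": 2, "name": "Ann Lee", "updated": 2020},
]

rules = [highest("updated"), longest("name")]
clusters = group_by_cluster(rows, "cluster")
survivors = {cid: pick_survivor(members, rules) for cid, members in clusters.items()}
```

In this model, the second rule only matters when the first rule leaves more than one candidate. Passing stop_after_first=True skips that tie-break, so a different row of the same cluster can survive.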
The Output fields section of the dialog includes the following elements:
Available - Displays the fields that you can make available for the next step in your data job. Items displayed in this list are dependent on your data sources and any preceding steps in your data job.
Selected - Displays the fields that will be made available to the next node in your data job.
Field Rules - Opens the Field Rules dialog, where you can create and maintain field rules and their rule expressions. Click Add to open the Add Field Rule dialog, where you select and build field rule expressions that consist of fields and the conditions that govern their behavior. These field rules can be constructed in the Add Field Rule Expression dialog.
Note: If fields examined by the “Minimum” or “Shortest” functions contain NULL values, the first such field/row is always selected. In other words, the NULL value is used in preference to non-NULL data.
Field rules determine which value, out of all of the cluster record values for one or more given fields, is assigned to that field in the surviving record. The updated values are passed along as part of the node's output in the data flow. It is up to a subsequent node in the data flow to do something meaningful with these values.
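The effect of a field rule, including the NULL behavior described in the note, can be modeled with a short Python sketch. It is a simplified illustration under stated assumptions, not the product's implementation. The function names and sample data are hypothetical, NULL is represented by None, and the NULL-skipping variant is shown only for contrast and is not a documented product feature.

```python
def shortest_value(values):
    """Model of the documented caveat: the first NULL (None) is selected
    in preference to any non-NULL data."""
    for value in values:
        if value is None:
            return None
    return min(values, key=len)


def shortest_non_null(values):
    """Contrast case (hypothetical): ignore NULLs before comparing lengths."""
    present = [v for v in values if v is not None]
    return min(present, key=len) if present else None


def apply_field_rules(survivor, cluster_rows, field_rules):
    """Return a copy of the survivor with each ruled field replaced by the
    value chosen from all rows in the cluster."""
    result = dict(survivor)
    for field, chooser in field_rules.items():
        result[field] = chooser([row[field] for row in cluster_rows])
    return result


cluster = [
    {"name": "Jonathan Smith", "phone": "919-555-0100"},
    {"name": "Jon Smith", "phone": None},
    {"name": "J. Smith", "phone": "5550100"},
]
survivor = cluster[0]

with_caveat = apply_field_rules(survivor, cluster, {"phone": shortest_value})
filtered = apply_field_rules(survivor, cluster, {"phone": shortest_non_null})
```

Here with_caveat carries a NULL phone value even though two rows have real data, which is the situation the note warns about. The survivor row itself is left unchanged, and the updated copy is what a downstream step would receive.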
You can access the following advanced properties by right-clicking the Surviving Record Identification node: