DataFlux Data Management Studio
You can add a Concurrent Real Time Clustering (CRTC) node to a data job to use a cluster state file to encapsulate cluster information that is held in memory during processing. The node is very similar to the Exclusive Real Time Clustering (ERTC) node. The difference is that the ERTC node interacts directly with the cluster state file, whereas the CRTC node interacts with a server that in turn interacts with the cluster state file. This extra step makes the CRTC node slower. However, a service on a DataFlux Data Management Server, or even a batch job, that uses the ERTC node cannot be called by more than one user at a time without receiving an error. When more than one user must call the job or service at the same time, you must use the CRTC node.
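The following Python sketch illustrates the difference between the two access patterns. It is a conceptual model only, not DataFlux code: the `ExclusiveState` and `StateServer` classes, their methods, and the sample match keys are all hypothetical names introduced for illustration. The first class rejects a second caller, much as the ERTC node does. The second class serializes requests from several callers through one owner of the state, which is the idea behind the CRTC node.

```python
import threading


class ExclusiveState:
    """Hypothetical model of direct, single-owner access (ERTC-style)."""

    def __init__(self):
        self._guard = threading.Lock()
        self._in_use = False

    def open(self):
        # A second caller is rejected while the first still holds the state.
        with self._guard:
            if self._in_use:
                raise RuntimeError("cluster state is already in use")
            self._in_use = True

    def close(self):
        with self._guard:
            self._in_use = False


class StateServer:
    """Hypothetical model of a server that owns the state (CRTC-style)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._cluster_ids = {}

    def assign(self, match_key):
        # Requests from many callers are serialized by the lock, so none fails.
        with self._lock:
            if match_key not in self._cluster_ids:
                self._cluster_ids[match_key] = len(self._cluster_ids) + 1
            return self._cluster_ids[match_key]


def demo():
    # Exclusive access: the second open() raises an error.
    state = ExclusiveState()
    state.open()
    try:
        state.open()  # simulates a second user calling the same job
        second_caller_failed = False
    except RuntimeError:
        second_caller_failed = True
    finally:
        state.close()

    # Server-mediated access: three concurrent callers all get an answer.
    server = StateServer()
    results = []
    threads = [
        threading.Thread(target=lambda k=key: results.append(server.assign(k)))
        for key in ("smith", "jones", "smith")
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return second_caller_failed, results
```

Calling `demo()` returns a flag showing that the second exclusive caller was rejected, together with the cluster IDs handed out by the server. The extra lock and hand-off in `StateServer` is also a simple picture of why the mediated route costs more per request.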
Once you have added the Concurrent Real Time Clustering node, you can double-click it to open its properties dialog. The properties dialog includes the following elements:
Name - Specifies a name for the node.
Notes - Enables you to open the Notes dialog. You use the dialog to enter optional details or any other relevant information for the input.
Output Cluster ID Field - Enables you to enter a name for the output field that holds the cluster ID. If no name is set, this information will not be available on output.
Options - Displays the Options dialog. The dialog includes the following elements:
Collapsed Cluster ID Field Prefix - Enables you to enter a name prefix for collapsed cluster ID fields. If no name is set, this information will not be available on output.
Transaction frequency - Specifies the transaction frequency of the clustering process. You can set the frequency to include transactions for every row, designate a transaction frequency for a specified number of rows, or set the frequency to include all of the rows in a single transaction.
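The three transaction frequency choices can be pictured as different ways of grouping rows. The Python sketch below is an illustration only and does not reflect how the node is implemented: the function name `transaction_batches` and its `rows_per_transaction` parameter are hypothetical, and using `None` to mean "all rows in a single transaction" is an assumption made for this example.

```python
from itertools import islice


def transaction_batches(rows, rows_per_transaction=None):
    """Group rows into transactions.

    rows_per_transaction=1    -> one transaction for every row
    rows_per_transaction=N    -> one transaction for every N rows
    rows_per_transaction=None -> all rows in a single transaction
    """
    iterator = iter(rows)
    if rows_per_transaction is None:
        batch = list(iterator)
        if batch:
            yield batch
        return
    if rows_per_transaction < 1:
        raise ValueError("rows_per_transaction must be at least 1")
    while True:
        # Take the next group of rows; the final group may be smaller.
        batch = list(islice(iterator, rows_per_transaction))
        if not batch:
            return
        yield batch
```

For example, `list(transaction_batches(range(5), 2))` groups five rows into three transactions, the last of which holds the single remaining row.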
The State file section of the dialog includes the following elements:
Backup Path - Enables you to specify a backup path. You can click Browse to navigate to the path.
Backup Prefix - Specifies a prefix to be used when generating state file backups. When initializing a new RTC, you can supply the same prefix (after the path to the state backup files).
Backup Frequency - Specifies the frequency of backups. When this value is set to 0, backup is done at shutdown only.
Save old State Files - When selected, specifies that old state backup files should not be deleted after the latest state is successfully saved.
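The way the state file settings work together can be sketched in Python. This is a simplified model under several assumptions that the documentation does not confirm: the backup frequency is counted in transactions, a backup is always taken at shutdown, and backup files are named with the prefix followed by a sequence number and a `.bak` extension. The class `BackupPlanner` and all of its members are hypothetical, and the sketch only records file names rather than writing files.

```python
class BackupPlanner:
    """Hypothetical model of the Backup Path, Prefix, Frequency, and Save old settings."""

    def __init__(self, backup_path, backup_prefix, backup_frequency=0, save_old=False):
        if backup_frequency < 0:
            raise ValueError("backup_frequency must be 0 or greater")
        self.backup_path = backup_path
        self.backup_prefix = backup_prefix
        self.backup_frequency = backup_frequency
        self.save_old = save_old
        self.files = []          # names of the backups that currently exist
        self._transactions = 0
        self._sequence = 0

    def _backup(self):
        # Assumed naming scheme: <path>/<prefix>_<sequence>.bak
        self._sequence += 1
        name = f"{self.backup_path}/{self.backup_prefix}_{self._sequence:04d}.bak"
        if not self.save_old:
            # Old backups are dropped once the latest state is recorded.
            self.files.clear()
        self.files.append(name)

    def on_transaction(self):
        self._transactions += 1
        # A frequency of 0 means no periodic backups at all.
        if self.backup_frequency > 0 and self._transactions % self.backup_frequency == 0:
            self._backup()

    def on_shutdown(self):
        # Assumption: a backup is taken at shutdown whatever the frequency is.
        self._backup()
```

With a frequency of 0, `files` stays empty until `on_shutdown()` is called. With Save old State Files modeled as `save_old=True`, earlier backups accumulate instead of being replaced.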
The Conditions section of the dialog includes the following elements:
Available - Displays the fields that you can make available for the next step in your data job. Items displayed in this list are dependent on your data sources and any preceding steps in your data job.
Selected - Displays the fields that will be made available to the next node in your data job. You can click OR to add an OR condition to the bottom of the Selected Fields list.
Additional Outputs - Displays the Additional Outputs dialog. This dialog enables you to specify the fields that you can make available to the next node in your data job.
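One way to think about OR conditions in the Selected list is as groups of fields. The Python sketch below assumes that the fields listed between OR entries must all match (an AND within each group) and that two rows belong together if any one group matches. That reading is an assumption made for illustration, and the function `rows_match`, the field names, and the sample rows are hypothetical.

```python
def rows_match(row_a, row_b, conditions):
    """Return True if the rows agree on every field of at least one condition.

    conditions is a list of field groups, for example
    [["name", "phone"], ["email"]] for "name AND phone, OR email".
    """
    def group_matches(group):
        # An empty group never matches; a field missing from either row fails.
        return bool(group) and all(
            field in row_a and field in row_b and row_a[field] == row_b[field]
            for field in group
        )

    return any(group_matches(group) for group in conditions)


conditions = [["name", "phone"], ["email"]]
row_1 = {"name": "SMITH", "phone": "555-0100", "email": "a@example.com"}
row_2 = {"name": "SMITH", "phone": "555-0100", "email": "b@example.com"}
row_3 = {"name": "JONES", "phone": "555-0199", "email": "a@example.com"}
```

Under these assumptions, `row_1` and `row_2` match on the first condition, `row_1` and `row_3` match on the second, and `row_2` and `row_3` do not match directly.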
You can access the following advanced properties by right-clicking the Concurrent Real Time Clustering node:
Doc ID: dfU_PFInt_ConcurRT_Clustering.html