DataFlux Data Management Studio 2.6: User Guide

Entity Resolution File Output Node

You can add an Entity Resolution File Output node to a data job to write clustered data to a DataFlux Data Management Studio Entity Resolution file for use in DataFlux Data Management Studio Entity Resolution. Because the Entity Resolution File Output node works only with clustered data, the incoming records must have a field that contains a numeric cluster identifier and must be sorted by that field.
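
The input requirement above can be checked outside the product before the job runs. The following Python sketch is not part of DataFlux Data Management Studio. It assumes that the clustered records have been exported as rows keyed by field name (for example, from a CSV file), that the cluster field is named CLUSTER_ID (a hypothetical name), and that "sorted" means ascending numeric order.

```python
import csv
import io


def check_clustered(rows, cluster_field):
    """Return a list of problems; an empty list means the rows look ready.

    Assumption: rows are dicts keyed by field name, and the cluster
    identifier is expected in ascending numeric order.
    """
    problems = []
    previous = None
    for line_no, row in enumerate(rows, start=1):
        raw = row.get(cluster_field, "")
        try:
            cluster_id = int(raw)
        except (TypeError, ValueError):
            # The node expects a numeric cluster identifier.
            problems.append(f"row {line_no}: non-numeric cluster ID {raw!r}")
            continue
        if previous is not None and cluster_id < previous:
            # A smaller ID after a larger one means the input is not sorted.
            problems.append(
                f"row {line_no}: cluster ID {cluster_id} follows {previous}"
            )
        previous = cluster_id
    return problems


# Hypothetical sample data: two records in cluster 1, one in cluster 2.
sample = "CLUSTER_ID,NAME\n1,Ann\n1,Anne\n2,Bob\n"
sample_rows = list(csv.DictReader(io.StringIO(sample)))
sample_problems = check_clustered(sample_rows, "CLUSTER_ID")
```

An empty result suggests that the records satisfy the two stated conditions. Any reported row would need to be corrected or re-sorted upstream of the node.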

Once you have added the node, you can double-click it to open its properties dialog. The properties dialog includes the following elements:

Name - Specifies a name for the node.

Notes - Enables you to open the Notes dialog. You use the dialog to enter optional details or any other relevant information for the node.

Cluster ID field - Enables you to select the input field that contains the numeric cluster identifier.

Source table - Enables you to select a data source for the Entity Resolution file information. You can also click Browse to navigate to the database table. The source is usually the job's data source, which is selected by default.

Output file - Enables you to enter the path and name for the Entity Resolution output file. You can also click Browse to specify a location for the output file. When you use the Browse button, the path that you select is relative to the current server and will be maintained if you move to another server. You can set an absolute path manually by entering it in the Output file field. Note that you must also change this absolute path if you change the server location.

Display file after job runs - When selected, launches Entity Resolution with the contents of the Entity Resolution file when the job finishes running.

Options - Enables you to set output file options. You can display a single-record cluster, specify the surviving record ID field, and ensure that the surviving record ID field contains the primary key value. You can also specify that field data is embedded in the output file. You must embed field data when you specify a text file as an input to the job that creates the Entity Resolution file (as documented in Create an Entity Resolution Data Job).

Edit Fields Settings - Enables you to access the Edit Fields Settings dialog. After you apply field rules for the surviving record identification step, you might need to preserve some content from the records that will be purged. For example, you might want to delete records that are blank for the name field. The Edit Fields Settings dialog enables you to select the fields that are preserved.

When you add a field to the selected fields list, the node inserts the value that it receives for that field from the data job into the SRI file when the node runs. That value is used in the surviving record for the cluster, and it replaces the value that exists for the field in the source database. (Note that the two values can be identical.)

Target Settings - Enables you to access the Target Settings dialog. You can use the dialog to configure the target type action: source table; flat file; generate audit file only; or source table or flat file and logically delete records. You can also select the data removal type.

The Primary keys section of the dialog includes the following elements:

Available - Displays the fields that you can make available for the next step in your data job. Items displayed in this list are dependent on your data sources and any preceding steps in your data job.

Selected fields - Displays the fields that will be made available to the next node in your data job. If desired, click in the Output Name fields to edit the names of any fields for the next step.

The Output fields section of the dialog includes the following elements:

Available - Displays the fields that you can make available for the next step in your data job. Items displayed in this list are dependent on your data sources and any preceding steps in your data job.

Selected - Displays the fields that will be made available to the next node in your data job.

You can also access the following advanced properties by right-clicking the Entity Resolution File Output node in the data job:

Documentation Feedback: yourturn@sas.com
Note: Always include the Doc ID when providing documentation feedback.

Doc ID: dfU_PFOutput_Merge.html