DataFlux Data Management Studio 2.5: User Guide
The Match Score Threshold node has a single parameter: the Minimum Score Threshold. This threshold value is applied to the combined score for the entire match code as specified in the Matchcode Layout node. Any match codes that have scores below this value are discarded.
The threshold value should be set to limit the output of the match definition to a reasonable number of suggestions for the typical input string. Because very low-scoring match codes are likely to be ignored during entity resolution anyway, discarding them at this stage will save resources when the match definition is run in a data job.
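The filtering rule described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Data Management Studio implements the node: the `ScoredMatchCode` type, the `apply_threshold` function, the 0-100 score scale, and the sample match codes and rule names are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoredMatchCode:
    """Hypothetical record for one candidate match code."""
    code: str   # match code before obfuscation
    score: int  # combined score for the entire match code (assumed 0-100 scale)
    rule: str   # token combination rule that produced the code


def apply_threshold(candidates, minimum_score):
    """Discard candidates that score below minimum_score; keep the rest in order."""
    return [c for c in candidates if c.score >= minimum_score]


# Invented sample data, for illustration only.
candidates = [
    ScoredMatchCode("SMITH JOHN", 95, "rule 1"),
    ScoredMatchCode("SMITH J", 70, "rule 2"),
    ScoredMatchCode("SMITH", 30, "rule 3"),
]

kept = apply_threshold(candidates, minimum_score=50)
for c in kept:
    print(c.code, c.score, c.rule)
```

In this sketch a candidate whose score equals the threshold is kept, because only scores below the threshold are discarded. Raising `minimum_score` shortens the list of suggestions that later stages have to consider.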
The Test Window output of the Match Score Threshold node is very similar to that of the Matchcode Layout node. The match codes are shown prior to obfuscation, together with their scores and the token combination rule from which each arose. If the same match code was generated by several token combination rules, the duplication is also indicated. Only the match codes that passed the threshold are shown.