SAS Quality Knowledge Base for Contact Information

Name (with Suggestions)

Match Definition

Name (with Suggestions)
Description: The Name (with Suggestions) match definition generates match codes that can be used to cluster records containing names of individuals.
Max Length of Match Code: 26 characters

Examples

Input               Cluster ID
HERMANN BORSCH      1
HERKANN BORSCH      1
HENRY NICKELSON     2
HENRY NICKERSON     2
PAUL HEIDEN         3
PAUL HEIDE          3
PAUL HEIDER         3
PAUL HEIDNER        4
PAUL HEIDER         4

Remarks

Note: The results listed above reflect the default match sensitivity (85).

This definition generates one or more match codes for each input string. Each match code represents a suggestion for what might be the true value of the input string; this enables two strings to be matched even when one or both strings contain a spelling mistake. For example, the name HERKANN might match the name HERMANN.
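The idea of several match codes per input string can be sketched in Python. This is not the algorithm used by the QKB, whose match codes are produced differently. The sketch assumes a toy scheme in which the "suggestions" for a name are the name itself plus every variant obtained by deleting one character from one word. The function name `suggestion_codes` is hypothetical.

```python
def suggestion_codes(name: str) -> set[str]:
    """Toy stand-in for suggestion-based match codes (an assumption, not the QKB method).

    Returns the normalized name plus every variant formed by deleting
    one character from one word of the name.
    """
    tokens = name.upper().split()
    codes = {" ".join(tokens)}
    for i, tok in enumerate(tokens):
        if len(tok) < 2:
            continue  # deleting the only character would leave an empty word
        for j in range(len(tok)):
            variant = tok[:j] + tok[j + 1:]
            codes.add(" ".join(tokens[:i] + [variant] + tokens[i + 1:]))
    return codes


a = suggestion_codes("HERMANN BORSCH")
b = suggestion_codes("HERKANN BORSCH")

# Two inputs match when they share at least one code.
shared = a & b
print(shared)
```

In this toy scheme the two spellings share a code because deleting the fourth letter of either first name yields the same string, so the pair can be matched even though neither input equals the other.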

Because multiple match codes are generated for each input, a subsequent clustering operation might place a record in more than one cluster. In the examples above, PAUL HEIDER appears in both cluster 3 and cluster 4. Therefore, pay special attention to the entity resolution process when using this definition.
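The overlap between clusters can be illustrated with a short Python sketch. The match codes below (`CODE_A`, `CODE_B`) are invented placeholders, not real QKB output, and the grouping logic is an assumption about how a clustering step keyed on match codes might behave. It groups records by code and then lists the records that land in more than one cluster, which are the ones an entity resolution step would need to review.

```python
from collections import defaultdict

# Hypothetical match codes, invented for illustration only.
records = {
    "PAUL HEIDEN": {"CODE_A"},
    "PAUL HEIDE": {"CODE_A"},
    "PAUL HEIDER": {"CODE_A", "CODE_B"},  # two suggestions, two codes
    "PAUL HEIDNER": {"CODE_B"},
}

# Cluster records by match code: every record carrying a code joins that cluster.
clusters = defaultdict(set)
for name, codes in records.items():
    for code in codes:
        clusters[code].add(name)

# Invert the mapping to see which clusters each record belongs to.
membership = defaultdict(set)
for code, members in clusters.items():
    for name in members:
        membership[name].add(code)

# Records in more than one cluster need attention during entity resolution.
ambiguous = sorted(name for name, codes in membership.items() if len(codes) > 1)
print(ambiguous)
```

How such records are resolved (for example, by merging the overlapping clusters or by assigning the record to a single cluster) is a design decision that depends on the application.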

Generating multiple match codes also requires more processing time than generating a single match code. Generating match codes with this definition might take up to five times as long as with a traditional match definition.

For more information on suggestion-based matching, refer to the Suggestion-Based Matching section of the DataFlux Data Management Studio Online Help.