DataFlux Data Management Studio 2.7: User Guide

Language Guess Definitions

A Language Guess definition guesses the language from which a string might originate. It can be used as a first step before other processing; for example, to decide which language's definitions to run on the string. Logically, this type of definition should be created in a language rather than in a locale.

At least one N-Gram Scheme Node or Regex Library Node must exist for the definition to be valid.

Input: a string

Example:

"28 Rue des Halles"

Output: a confidence score, indicating how likely it is that the input belongs to the definition's language.
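
This page does not describe how the confidence score is computed. As a rough illustration only, the following Python sketch scores a string by the fraction of its character trigrams that also occur in a small sample profile for one language. The function names, the sample phrases, and the scoring formula are all assumptions made for this example; they are not the product's algorithm.

```python
import re


def char_ngrams(text, n=3):
    """Return the character n-grams of a lowercased, space-padded string."""
    cleaned = re.sub(r"\s+", " ", text.lower()).strip()
    padded = f" {cleaned} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def build_profile(samples, n=3):
    """Collect the set of n-grams seen in sample strings of one language."""
    profile = set()
    for sample in samples:
        profile.update(char_ngrams(sample, n))
    return profile


def confidence(text, profile, n=3):
    """Fraction of the input's n-grams found in the profile (0.0 to 1.0)."""
    grams = char_ngrams(text, n)
    if not grams:
        return 0.0
    hits = sum(1 for gram in grams if gram in profile)
    return hits / len(grams)


# Hypothetical training phrases standing in for an n-gram scheme.
FRENCH_SAMPLES = [
    "rue de la paix",
    "avenue des champs",
    "place des halles",
    "boulevard saint michel",
]

french_profile = build_profile(FRENCH_SAMPLES)
score_fr = confidence("28 Rue des Halles", french_profile)
score_other = confidence("742 Evergreen Terrace", french_profile)
print(score_fr, score_other)
```

With this toy profile, the French street address scores higher than the English one, which is the kind of signal a downstream step could use to pick a language.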

Nodes

Hierarchy  Node/Group                           Container Group                    Count
1          Language Guess Definition Head Node  -                                  1
2          Preprocessing Regex Library Group    -                                  1
2.1        Preprocessing Regex Library Node     Preprocessing Regex Library Group  0 or more
3          N-Gram Analysis Group                -                                  1
3.1        N-Gram Scheme Node                   N-Gram Analysis Group              0 or more
4          Regex Search Group                   -                                  1
4.1        Regex Library Node                   Regex Search Group                 0 or more
5          Scoring Node                         -                                  1
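
The node counts above, combined with the rule that at least one N-Gram Scheme Node or Regex Library Node must exist, can be sketched as a small data structure. This is a minimal Python model for illustration; the class name, field names, and file names are hypothetical and do not correspond to any DataFlux API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LanguageGuessDefinition:
    """Toy model of the node hierarchy (names are assumptions)."""

    # 2.1 Preprocessing Regex Library Nodes: 0 or more
    preprocessing_regex_libraries: List[str] = field(default_factory=list)
    # 3.1 N-Gram Scheme Nodes: 0 or more
    ngram_schemes: List[str] = field(default_factory=list)
    # 4.1 Regex Library Nodes: 0 or more
    regex_libraries: List[str] = field(default_factory=list)

    def is_valid(self) -> bool:
        """Valid only if at least one N-Gram Scheme or Regex Library node exists."""
        return bool(self.ngram_schemes or self.regex_libraries)


empty = LanguageGuessDefinition()
only_preprocessing = LanguageGuessDefinition(
    preprocessing_regex_libraries=["cleanup.rlb"]
)
with_ngrams = LanguageGuessDefinition(ngram_schemes=["french.ngs"])
with_regex = LanguageGuessDefinition(regex_libraries=["french_words.rlb"])

print(empty.is_valid(), only_preprocessing.is_valid())
print(with_ngrams.is_valid(), with_regex.is_valid())
```

Note that a Preprocessing Regex Library Node alone does not satisfy the rule in this model, because the requirement names only the N-Gram Scheme and Regex Library nodes.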

Doc ID: DMCust_12600.html