
DataFlux Data Management Studio 2.5: User Guide

Token Mappings Node

The Token Mappings Node receives pairs of substrings and categories. The categories determine the assignment of substrings to tokens.

The output of the node is a list of tokens and token strings. Token strings are concatenations of the substrings that have been assigned to each token by category.

Used in: Parse definitions and Extraction definitions

Properties

Parse Definition

Start in the Solution-Based tab, and then add details as needed in the Default tab.

Solution-Based tab

Use the Solution-Based tab to add, edit, reorder, or delete a list of categories for each token. Click a token to display its categories. Use the icons on the right to make changes. All input substrings with matching categories are added to the output string for the selected token.

For example, in the context of name-parsing, a category called Middle Name Word (MNW) could be mapped to a token called Middle Name. A category called Middle Name Initial (MNI) could also be mapped to the Middle Name token. All substrings that have been classified as MNW or MNI will be added to the token string that is output with the Middle Name token.
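The mapping in this example can be sketched in Python. This is only an illustration of the behavior described above, not DataFlux code: the function name map_tokens, the dictionary layout, the sample substrings, and the GNW and FNW category names are all assumptions made for the sketch.

```python
# Hypothetical illustration of category-to-token mapping; not part of DataFlux.
def map_tokens(pairs, token_categories, separator=" "):
    """Build one token string per token from (substring, category) pairs."""
    # Invert the mapping so that each category points to its token.
    category_to_token = {
        category: token
        for token, categories in token_categories.items()
        for category in categories
    }
    collected = {token: [] for token in token_categories}
    for substring, category in pairs:
        token = category_to_token.get(category)
        if token is not None:  # substrings with unmapped categories are skipped here
            collected[token].append(substring)
    # Concatenate the substrings assigned to each token.
    return {token: separator.join(parts) for token, parts in collected.items()}


# Assumed sample input: substrings that have already been classified.
pairs = [("Mary", "GNW"), ("Ann", "MNW"), ("K", "MNI"), ("Smith", "FNW")]
token_categories = {
    "Given Name": ["GNW"],
    "Middle Name": ["MNW", "MNI"],  # two categories feed one token
    "Family Name": ["FNW"],
}
result = map_tokens(pairs, token_categories)
```

In this sketch, both "Ann" (MNW) and "K" (MNI) end up in the token string for Middle Name, joined by a single blank space.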

By default, the substrings in a token string are separated by a single blank space. Click Use Separator to remove the separator that is applied between substrings. To specify a different separator string, click a token and click the edit icon.

Default tab

Use the Default tab to refine the content of your token strings.

Under Mapping, click a token and click the plus icon to specify a maximum number of words for a category. You can specify different numbers for different categories.

Click Default Token to change the token that receives all input substrings that are not assigned to tokens. The default value (None) discards substrings that are not assigned to tokens.

Under Exclude Categories, specify the categories that are prevented from appearing in the node output. No categories are excluded by default. Categories that apply to punctuation marks, such as COMMA and DASH, are typically excluded.
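The three Default tab options can be modeled together in a short Python sketch. It is a simplified illustration, not the node's actual implementation. The function name map_with_defaults, the sample categories, and the token named Other are hypothetical. The sketch also assumes that excluded categories are dropped first and that words beyond a category's maximum are treated as unassigned; this topic does not state how the node handles either case.

```python
# Hypothetical model of the Default tab options; not DataFlux code.
def map_with_defaults(pairs, category_to_token, max_words=None,
                      default_token=None, excluded=()):
    """Apply a word limit, a default token, and excluded categories."""
    max_words = max_words or {}
    counts = {}   # words accepted so far, per category
    output = {}   # token -> list of substrings
    for substring, category in pairs:
        if category in excluded:
            continue  # excluded categories never reach the output
        token = category_to_token.get(category)
        used = counts.get(category, 0)
        limit = max_words.get(category)
        if token is not None and (limit is None or used < limit):
            counts[category] = used + 1
        else:
            # Assumption: unmapped or over-limit substrings count as unassigned.
            token = default_token
        if token is None:
            continue  # a default token of None discards unassigned substrings
        output.setdefault(token, []).append(substring)
    return {token: " ".join(parts) for token, parts in output.items()}


# Assumed sample data; XYZ stands for a category that is mapped to no token.
pairs = [("Smith", "FNW"), (",", "COMMA"), ("Mary", "GNW"),
         ("Ann", "MNW"), ("Lee", "MNW"), ("PhD", "XYZ")]
category_to_token = {"FNW": "Family Name", "GNW": "Given Name", "MNW": "Middle Name"}

discarded = map_with_defaults(pairs, category_to_token,
                              max_words={"MNW": 1}, excluded={"COMMA"})
kept = map_with_defaults(pairs, category_to_token, max_words={"MNW": 1},
                         default_token="Other", excluded={"COMMA"})
```

With the default token left at None, "Lee" and "PhD" are discarded. With the hypothetical Other token selected, both substrings are collected in its token string.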

Extraction Definition

Use token separator

If the Use token separator check box is selected, the displayed token separator is inserted between substrings. The default separator is a semicolon (;). Click in the separator field to change the separator.

Default Token

In the Default Token field, select a token that receives unassigned substrings. Available selections include tokens, an Additional Info token, and a value of (None). Select (None) to discard unassigned substrings. Additional Info is selected by default.
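The Extraction Definition settings can be illustrated with a small Python sketch. The function name build_extraction_output and the COLOR, SIZE, and UNKNOWN categories are hypothetical, and the code is not taken from DataFlux. It also assumes that when Use token separator is cleared, substrings are concatenated with nothing between them, which this topic does not state explicitly.

```python
# Hypothetical model of the Extraction Definition properties; not DataFlux code.
def build_extraction_output(pairs, category_to_token,
                            use_token_separator=True, token_separator=";",
                            default_token="Additional Info"):
    """Group substrings by token and join them with the token separator."""
    # Assumption: with the check box cleared, nothing is inserted between substrings.
    joiner = token_separator if use_token_separator else ""
    output = {}
    for substring, category in pairs:
        # Unassigned substrings go to the default token.
        token = category_to_token.get(category, default_token)
        if token is None:
            continue  # (None) discards unassigned substrings
        output.setdefault(token, []).append(substring)
    return {token: joiner.join(parts) for token, parts in output.items()}


# Assumed sample data for the sketch.
pairs = [("red", "COLOR"), ("blue", "COLOR"), ("large", "SIZE"), ("sale", "UNKNOWN")]
category_to_token = {"COLOR": "Color", "SIZE": "Size"}

with_defaults = build_extraction_output(pairs, category_to_token)
without_default_token = build_extraction_output(pairs, category_to_token,
                                                default_token=None)
```

With the default settings, the two COLOR substrings are joined by a semicolon and the unassigned substring "sale" is collected under Additional Info. Passing None for the default token drops it instead.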

Output

Results

Table with two columns: the token and its token string.
