DataFlux Data Management Studio 2.7: User Guide

Token Mappings Node

The Token Mappings Node receives pairs of substrings and categories. The categories determine the assignment of substrings to tokens.

The output of the node is a list of tokens and token strings. Token strings are concatenations of the substrings that have been assigned to each token by category.

Used in:

Properties

Parse Definition

Start in the Solution-Based tab, and then add details as needed in the No Solution tab.

Solution-Based tab

Use the Solution-Based tab to add, edit, reorder, or delete a list of categories for each token. Click a token to display its categories. Use the icons on the right to make changes. All input substrings with matching categories are added to the output string for the selected token.

For example, in the context of name-parsing, a category called Middle Name Word (MNW) could be mapped to a token called Middle Name. A category called Middle Name Initial (MNI) could also be mapped to the Middle Name token. All substrings that have been classified as MNW or MNI will be added to the token string that is output with the Middle Name token.

Click Use Separator to clear the check box and remove the separator that is applied between substrings. To specify a different separator, click a token and click the edit icon. The default separator string is a single blank space.
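The mapping and separator behavior described above can be illustrated with a minimal Python sketch. This is not DataFlux code or a DataFlux API. The function name map_tokens, the dictionary layout, and the category codes GNW and FNW are hypothetical and used only for illustration. MNW, MNI, and the Middle Name token come from the example above.

```python
# Hypothetical category-to-token mapping; GNW and FNW are invented for this sketch.
CATEGORY_TO_TOKEN = {
    "GNW": "Given Name",
    "MNW": "Middle Name",
    "MNI": "Middle Name",
    "FNW": "Family Name",
}


def map_tokens(pairs, category_to_token, separator=" "):
    """Concatenate the substrings assigned to each token, in input order."""
    collected = {}
    for substring, category in pairs:
        token = category_to_token.get(category)
        if token is None:
            continue  # this sketch simply skips substrings with unmapped categories
        collected.setdefault(token, []).append(substring)
    # Join each token's substrings with the separator (a single blank space by default).
    return {token: separator.join(parts) for token, parts in collected.items()}


pairs = [("Mary", "GNW"), ("Ann", "MNW"), ("J.", "MNI"), ("Smith", "FNW")]
result = map_tokens(pairs, CATEGORY_TO_TOKEN)
no_separator = map_tokens(pairs, CATEGORY_TO_TOKEN, separator="")
print(result)
```

In this sketch, both Ann (MNW) and J. (MNI) end up in the Middle Name token string. Passing an empty separator models the case where no separator is applied.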

No Solution tab

Use the No Solution tab to define default mappings for the output of this node. These mappings will be used only when the conditions that are specified on the Solution-Based tab are not found.

Note: If a Parse definition is embedded in another definition, the No Solution mappings are ignored unless the definition in which the Parse definition is embedded is a Match definition.

Under Map categories when no solution is found, click a token and click the plus icon to specify a maximum number of words for a category. You can specify different numbers for different categories.

Under Categories to exclude, specify the categories that will be prevented from appearing in the node output. No categories are excluded by default. Categories that apply to punctuation marks, such as COMMA and DASH, are typically excluded.

Click Default Token to change the token that receives all input substrings that are not assigned to tokens. The default value (None) discards substrings that are not assigned to tokens.
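The following Python sketch shows one plausible reading of how the No Solution settings could interact. It is an illustration under stated assumptions, not the product's actual algorithm. The function no_solution_map, the limits structure, the category codes GNW and FNW, and the token name Other are all hypothetical. In particular, the sketch assumes that words beyond a category's maximum are treated as unassigned.

```python
def no_solution_map(pairs, limits, excluded=(), default_token=None, separator=" "):
    """Fallback mapping sketch.

    limits maps a category to (token, max_words); this layout is hypothetical.
    """
    used = {}       # number of words already assigned, per category
    collected = {}  # token name to list of substrings
    for substring, category in pairs:
        if category in excluded:
            continue  # excluded categories never appear in the output
        token, max_words = limits.get(category, (None, 0))
        if token is not None and used.get(category, 0) < max_words:
            used[category] = used.get(category, 0) + 1
        else:
            token = default_token  # assumption: overflow and unmapped words are unassigned
        if token is None:
            continue  # a default token of None discards unassigned substrings
        collected.setdefault(token, []).append(substring)
    return {t: separator.join(parts) for t, parts in collected.items()}


pairs = [("Smith", "FNW"), (",", "COMMA"), ("Mary", "GNW"), ("Ann", "GNW")]
limits = {"GNW": ("Given Name", 1), "FNW": ("Family Name", 1)}

discarded = no_solution_map(pairs, limits, excluded={"COMMA"})
kept = no_solution_map(pairs, limits, excluded={"COMMA"}, default_token="Other")
print(discarded)
print(kept)
```

With no default token, the second GNW word is dropped. With a default token set, the same word is collected under that token instead.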

Extraction Definition

Use token separator

If the Use token separator check box is selected, the displayed token separator is inserted between substrings. The default separator is a semicolon (;). Click in the separator field to change the separator.

Default Token

In the Default Token field, select a token that receives unassigned substrings. Available selections include tokens, an Additional Info token, and a value of (None). Select (None) to discard unassigned substrings. Additional Info is selected by default.
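As a rough illustration of the two Extraction Definition settings, the Python sketch below joins substrings with a semicolon and routes unassigned substrings to Additional Info. The function extract_tokens, the category codes COLOR and SIZE, and the Color token are hypothetical. The sketch also assumes that substrings are concatenated directly when the separator is not used, which the text above does not state.

```python
def extract_tokens(pairs, category_to_token, use_separator=True,
                   separator=";", default_token="Additional Info"):
    """Collect substrings per token; unassigned substrings go to default_token."""
    # Assumption: with the separator turned off, substrings are joined directly.
    joiner = separator if use_separator else ""
    collected = {}
    for substring, category in pairs:
        token = category_to_token.get(category, default_token)
        if token is None:
            continue  # a default token of None discards unassigned substrings
        collected.setdefault(token, []).append(substring)
    return {t: joiner.join(parts) for t, parts in collected.items()}


pairs = [("red", "COLOR"), ("blue", "COLOR"), ("large", "SIZE")]
mapping = {"COLOR": "Color"}  # hypothetical category and token names

with_default = extract_tokens(pairs, mapping)
discard_unassigned = extract_tokens(pairs, mapping, default_token=None)
print(with_default)
```

Here the SIZE substring has no mapped token, so it lands in Additional Info unless the default token is set to None.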

Output

Results

Table with two columns:


Doc ID: DMCust_12328.html