DataFlux Data Management Studio 2.6: User Guide
An extraction definition extracts parts of the input string and assigns them to corresponding tokens of the associated data type.
Input: a string
Example:
"100 Slightly used green Acme XJF-100 raygun $100 c/w lots of shiny buttons"
Output: mapping between tokens and substrings
Example:
Quantity => 100
Brand => "Acme"
Model => "XJF-100"
Color => "green"
Price => "$100"
Description => "Slightly used raygun c/w lots of shiny buttons"
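The example mapping can be pictured as a dictionary keyed by token name. The sketch below is illustrative only: it is plain Python rather than DataFlux code, and the names `extraction`, `input_words`, and `words_come_from_input` are assumptions made for this illustration. It checks that every word in the output is drawn from the input string and that, in this particular example, each input word ends up in exactly one token.

```python
# Illustrative only: models the example output as a plain dict of strings.
input_string = "100 Slightly used green Acme XJF-100 raygun $100 c/w lots of shiny buttons"

extraction = {
    "Quantity": "100",
    "Brand": "Acme",
    "Model": "XJF-100",
    "Color": "green",
    "Price": "$100",
    "Description": "Slightly used raygun c/w lots of shiny buttons",
}

input_words = input_string.split()


def words_come_from_input(value, words):
    """Return True if every whitespace-separated word in value occurs in words."""
    return all(word in words for word in value.split())


# Every token value is built only from words of the input string.
all_from_input = all(
    words_come_from_input(value, input_words) for value in extraction.values()
)

# In this example the token values together use each input word exactly once.
token_words = [word for value in extraction.values() for word in value.split()]
covers_input = sorted(token_words) == sorted(input_words)
```

Both `all_from_input` and `covers_input` are true for the example above. Note that the Description token collects words that are not adjacent in the input.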
Hierarchy | Node/Group | Container Group | Count |
---|---|---|---|
1 | Extraction Definition Head Node | | 1 |
2 | Preprocessing Regex Library Group | | 1 |
2.1 | Preprocessing Regex Library Node | Preprocessing Group | 0 or more |
3 | Chopping Group | | 1 |
3.1 | Chop Table Node | Chopping Group | 1 |
4 | Table-Based Extraction Group | | 1 |
4.1 | Extraction Scheme Node | Table-Based Extraction Group | 0 or more |
5 | Pattern-Based Extraction Group | | 1 |
5.1 | Morph Analysis Group | Pattern-Based Extraction Group | 1 (*) |
5.1.1 | Lookup Normalization Group | Morph Analysis Group | 1 (*) |
5.1.1.1 | Uppercasing Node | Lookup Normalization Group | 1 (*) |
5.1.1.2 | Normalization Regex Libraries Group | Lookup Normalization Group | 1 (*) |
5.1.1.2.1 | Normalization Regex Library Node | Normalization Regex Libraries Group | 0 or more (*) |
5.1.2 | Vocabularies Group | Morph Analysis Group | 1 (*) |
5.1.2.1 | Vocabulary Node | Vocabularies Group | 1 or more (*) |
5.1.3 | Categorization Regex Libraries Group | Morph Analysis Group | 1 (*) |
5.1.3.1 | Categorization Regex Library Node | Categorization Regex Libraries Group | 0 or more (*) |
5.1.4 | Number Check Node | Morph Analysis Group | 1 (*) |
5.1.5 | Default Categories Node | Morph Analysis Group | 1 (*) |
5.2 | Pattern Recognition Group | Pattern-Based Extraction Group | 1 (*) |
5.2.1 | Pattern Logic Node | Pattern Recognition Group | 1 or more |
6 | Token Mappings Node | | 1 |
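The Count column reads as a cardinality constraint on each node or group. As a rough sketch, assuming that reading, the three count phrases used in the table could be checked as follows. This is plain Python, not a DataFlux API: `COUNT_RULES`, `count_is_valid`, and the instance counts in `actual` are hypothetical, and the sketch ignores the (*) marker.

```python
# Hypothetical helper: interprets the Count phrases that appear in the table.
COUNT_RULES = {
    "1": lambda n: n == 1,
    "0 or more": lambda n: n >= 0,
    "1 or more": lambda n: n >= 1,
}


def count_is_valid(rule, actual_count):
    """Return True if actual_count instances satisfy the Count phrase rule."""
    rule = rule.replace("(*)", "").strip()  # the (*) marker is ignored here
    return COUNT_RULES[rule](actual_count)


# A few rows from the table: node name -> Count phrase.
required = {
    "Chop Table Node": "1",
    "Extraction Scheme Node": "0 or more",
    "Vocabulary Node": "1 or more (*)",
    "Pattern Logic Node": "1 or more",
}

# Made-up instance counts for an imaginary definition.
actual = {
    "Chop Table Node": 1,
    "Extraction Scheme Node": 0,
    "Vocabulary Node": 2,
    "Pattern Logic Node": 0,
}

violations = [
    name for name, rule in required.items()
    if not count_is_valid(rule, actual[name])
]
print(violations)  # ['Pattern Logic Node']
```

In this made-up case only the Pattern Logic Node fails, because the table asks for one or more and the imaginary definition has none.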