DataFlux Data Management Studio 2.6: User Guide

Regex Library Editor

Regular expressions are powerful tools you can use to transform data. In the DataFlux Data Management Studio implementation, regular expressions are organized into libraries that you can use for parsing, standardization, and matching. To build and test these libraries, use the Customize Regex Library Editor. ("Regex" is short for "regular expression.")

In the context of parse definitions, standardization definitions, and match definitions, regular expressions are primarily intended for character-level cleansing and transformations. For word- and phrase-level transformations, you should instead use standardization data libraries.
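
As a rough illustration of what "character-level" cleansing means, the sketch below uses Python's `re` module as a stand-in for the QKB regex engine (an assumption made only for illustration; the function name `cleanse_characters` is hypothetical and not part of DataFlux). It removes stray punctuation and collapses whitespace, which are the kinds of changes a regular expression handles well. Replacing a whole word such as "Inc" with "Incorporated" is a word-level change and belongs in a standardization data library.

```python
import re

def cleanse_characters(value: str) -> str:
    """Hypothetical character-level cleanup, for illustration only."""
    # Drop any character that is not a word character, whitespace, or a hyphen.
    value = re.sub(r"[^\w\s-]", "", value)
    # Collapse runs of whitespace into a single space.
    value = re.sub(r"\s+", " ", value)
    # Remove leading and trailing whitespace.
    return value.strip()

print(cleanse_characters("  ACME,   Inc.  "))  # prints: ACME Inc
```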

Regular expressions you create in the Regex Library Editor must adhere to the syntax defined for Perl regular expressions. For information on writing Perl regular expressions, see Mastering Regular Expressions by Jeffrey E.F. Friedl or other readily available reference material on the subject.

Building Regex Libraries

  1. Open the Regex Library Editor. On the Customize main screen, select Tools > Other QKB Editors > Regex Library Editor. The Regex Library Editor dialog appears.
  2. Set a QKB. Select Options > Set QKB. The Regex Library Editor window appears. Select the appropriate QKB and locale and click Open. The QKB and locale setting is saved from session to session, so you do not need to specify it again unless you need to build a Regex Library file for a different locale.
  3. Create a New Regex Library File. Select File > New.
  4. Add Your Regular Expressions. By default, when you create a new Regex Library file, the Regex Library Editor will ask you to supply the first Regular Expression/Substitution pair. Type the Regular Expression and the Substitution in the appropriate fields and click OK.



    Before you add more regular expressions, note that they are applied sequentially, in the order in which they appear in the Regex Library. A value modified by one regular expression can therefore be modified again by a regular expression that appears further down the list, and this can happen several times as DataFlux Data Management Studio applies each regular expression from top to bottom. Take care that one regular expression does not undo or interfere with the effects of another.



    New regular expression rules are always added to the end of the list. To place a rule elsewhere, click the newly created row and drag it to the desired position in the list. Repeat this process until you have defined all your regular expressions.
  5. Save Your Regex Library. Select File > Save. Because this is a new Regex Library, the Regex Library Editor will prompt you for a file name.
  6. Test Your Regex Library. After creating a Regex Library, you can use the Regex Library Editor to test your expressions. At the bottom of the Regex Library Editor is the Test Area, where you can type sample input strings to verify that your regular expressions are written correctly. Type an input string and observe the result. If the result is not what you intended, modify your regular expressions or their order and test again. When you are satisfied with the results, be sure to save your changes. Note that your input string in the Test Area is highlighted where it appears in your regular expressions.
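
The top-to-bottom behavior described in step 4 and the testing in step 6 can be approximated outside the editor. The following is a minimal sketch, assuming a Regex Library can be modeled as an ordered list of Regular Expression/Substitution pairs and using Python's `re` module in place of the Perl-syntax engine. The names `LIBRARY`, `apply_library`, and `run_tests` are hypothetical, and the sample rules are not taken from any shipped QKB.

```python
import re

# Hypothetical stand-in for a Regex Library: ordered (pattern, substitution) pairs.
LIBRARY = [
    (r"\.", ""),      # 1. remove periods
    (r"\s+", " "),    # 2. collapse runs of whitespace into one space
    (r"^ | $", ""),   # 3. trim the single leading or trailing space that remains
]

def apply_library(library, value):
    """Apply every rule in order; each rule sees the output of the previous one."""
    for pattern, substitution in library:
        value = re.sub(pattern, substitution, value)
    return value

def run_tests(library, samples):
    """Mimic the Test Area: map each sample input to its transformed result."""
    return {sample: apply_library(library, sample) for sample in samples}

for sample, result in run_tests(LIBRARY, ["  U.S.A.  ", "St.  Louis"]).items():
    print(repr(sample), "->", repr(result))
```

Keeping a small set of sample inputs like this alongside a library makes it easier to re-check results after a rule is added or moved.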

Modifying Regex Library Files

After using the Regex Library Editor Test Area to test your regular expressions, you might find some unintended effects because of the order of your expressions.

To alter the expression order in a Regex Library:

  1. In the Regex Library Editor, select the row you want to move.
  2. Drag the row to the desired location, and then release the mouse button. The row appears in its new position.
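
To see why row order matters, consider the sketch below. It again uses Python's `re` module as an assumed stand-in for the editor's engine, and the two rules and the helper `apply_in_order` are hypothetical. Python writes backreferences in the substitution as `\1` and `\2`; the substitution syntax in the Regex Library Editor may differ.

```python
import re

def apply_in_order(rules, value):
    """Apply each (pattern, substitution) rule from top to bottom."""
    for pattern, substitution in rules:
        value = re.sub(pattern, substitution, value)
    return value

# Two hypothetical rules whose results depend on which one runs first.
strip_hyphens = (r"-", "")                          # delete every hyphen
format_phone = (r"^(\d{3})-(\d{4})$", r"\1 \2")     # rewrite 555-1234 as 555 1234

original_order = [strip_hyphens, format_phone]
reordered = [format_phone, strip_hyphens]

# With strip_hyphens first, the hyphen is gone before format_phone can match.
print(apply_in_order(original_order, "555-1234"))   # prints: 5551234
# With format_phone first, the intended formatting is applied.
print(apply_in_order(reordered, "555-1234"))        # prints: 555 1234
```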

Documentation Feedback: yourturn@sas.com
Note: Always include the Doc ID when providing documentation feedback.

Doc ID: dfU_Cstm_RegEx_16000.html