SAS Quality Knowledge Base for Contact Information 26
Pattern analysis definitions specify data and logic that are used to determine the format of a data string.
The output of a pattern analysis definition is a text-based pattern. Typically, the output pattern is a series of symbols that represent the types of characters found in the string: numbers, letters, punctuation, and so on.
For example, a pattern analysis definition might use the following symbols as abbreviations for character types:
Abbreviation | Character Type |
---|---|
A | Uppercase Letter |
a | Lowercase Letter |
9 | Number |
* | Other |
The results of applying this pattern analysis definition to a string would then be as follows:
Input | Output |
---|---|
1 877-846-Flux | 9 999*999*Aaaa |
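The mapping shown in the two tables above can be sketched in a few lines of Python. This is only an illustration of the idea, not how a QKB definition is implemented. The function name `char_pattern` is hypothetical, and passing whitespace through unchanged is an assumption inferred from the space that survives in the example output.

```python
def char_pattern(value: str) -> str:
    """Map each character of value to a pattern symbol (9, A, a, or *)."""
    symbols = []
    for ch in value:
        if ch.isspace():
            # Assumption: whitespace is copied through as-is,
            # matching the space kept in the example output.
            symbols.append(ch)
        elif ch.isdigit():
            symbols.append("9")   # number
        elif ch.isupper():
            symbols.append("A")   # uppercase letter
        elif ch.islower():
            symbols.append("a")   # lowercase letter
        else:
            symbols.append("*")   # anything else (punctuation and so on)
    return "".join(symbols)


print(char_pattern("1 877-846-Flux"))  # prints 9 999*999*Aaaa
```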
You can use a pattern analysis definition to perform analytics or to check for values that you consider invalid. For instance, if you have a table of social security numbers, you might perform pattern analysis to find all records that do not have this format:
999*99*9999
Records with an invalid format could then be flagged, corrected, or discarded.
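A check of this kind might look like the following Python sketch. A U.S. social security number has nine digits grouped 3-2-4, so the sketch compares each value against the pattern 999*99*9999. The names `char_pattern`, `flag_invalid`, and `EXPECTED_PATTERN` are hypothetical, the sample values are made up, and copying whitespace through unchanged is an assumption rather than documented QKB behavior.

```python
EXPECTED_PATTERN = "999*99*9999"  # three digits, two digits, four digits


def char_pattern(value: str) -> str:
    """Map each character to 9, A, a, or *; whitespace is kept (assumption)."""
    def symbol(ch: str) -> str:
        if ch.isspace():
            return ch
        if ch.isdigit():
            return "9"
        if ch.isupper():
            return "A"
        if ch.islower():
            return "a"
        return "*"
    return "".join(symbol(ch) for ch in value)


def flag_invalid(records):
    """Return the records whose pattern differs from EXPECTED_PATTERN."""
    return [r for r in records if char_pattern(r) != EXPECTED_PATTERN]


# Made-up sample values, for illustration only.
records = ["123-45-6789", "123456789", "12-345-6789", "123 45 6789"]
print(flag_invalid(records))
# prints ['123456789', '12-345-6789', '123 45 6789']
```

Because * stands for any character outside the other classes, a value such as 123.45.6789 would also match this pattern. A stricter check would need to compare the separator characters as well.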
Similarly, you might use a pattern analysis definition to implement business rules that prevent users from entering malformed data values into a table.