SAS Quality Knowledge Base for Contact Information 26

Pattern Analysis Definitions

Pattern analysis definitions specify data and logic that are used to determine the format of a data string.

The output of a pattern analysis definition is a text-based pattern. Typically, the output pattern is a series of symbols, each representing the type of character found at that position in the string: number, letter, punctuation, and so on.

For example, a pattern analysis definition might use the following symbols as abbreviations for types of characters:

Abbreviation    Character Type
A               Uppercase Letter
a               Lowercase Letter
9               Number
*               Other

The results of applying this pattern analysis definition to a string would then be as follows:

Input             Output
1 877-846-Flux    9 999*999*Aaaa
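
The character mapping in this example can be sketched in a few lines of Python. This is only an illustration of the idea, not how a pattern analysis definition is implemented in the Quality Knowledge Base. The function name char_pattern is hypothetical, and the choice to copy spaces through unchanged is an assumption inferred from the space that appears in the sample output.

```python
def char_pattern(text: str) -> str:
    """Return a character-type pattern for text (illustrative sketch only)."""
    symbols = []
    for ch in text:
        if ch.isdigit():
            symbols.append("9")   # Number
        elif ch.isupper():
            symbols.append("A")   # Uppercase Letter
        elif ch.islower():
            symbols.append("a")   # Lowercase Letter
        elif ch.isspace():
            symbols.append(ch)    # Assumption: whitespace is kept as-is
        else:
            symbols.append("*")   # Other
    return "".join(symbols)


print(char_pattern("1 877-846-Flux"))
```

Applied to the input above, the sketch produces the same pattern shown in the table.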

You can use a pattern analysis definition to perform analytics or to check for values that you consider invalid. For instance, if you have a table of Social Security numbers, you might perform pattern analysis to find all records that do not have this format:

999*99*9999

Records with an invalid format could then be flagged, corrected, or discarded.
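
As a rough sketch of this kind of check, the Python below flags values whose pattern differs from an expected one. It assumes the symbol scheme from the earlier example and a Social Security number written as three, two, and four digits separated by punctuation, which gives the pattern 999*99*9999. The names char_pattern and find_invalid and the sample values are hypothetical, and the code is not the Quality Knowledge Base implementation.

```python
from typing import Iterable, List


def char_pattern(text: str) -> str:
    """Return a character-type pattern for text (illustrative sketch only)."""
    symbols = []
    for ch in text:
        if ch.isdigit():
            symbols.append("9")   # Number
        elif ch.isupper():
            symbols.append("A")   # Uppercase Letter
        elif ch.islower():
            symbols.append("a")   # Lowercase Letter
        elif ch.isspace():
            symbols.append(ch)    # Assumption: whitespace is kept as-is
        else:
            symbols.append("*")   # Other
    return "".join(symbols)


def find_invalid(records: Iterable[str], expected: str = "999*99*9999") -> List[str]:
    """Return the records whose pattern does not equal the expected pattern."""
    return [value for value in records if char_pattern(value) != expected]


# Made-up sample values, for illustration only
sample = ["123-45-6789", "123456789", "12-345-6789", "123 45 6789"]
flagged = find_invalid(sample)
print(flagged)
```

A check like this tests only the format of each value. It does not confirm that a value is a real, issued number.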

Similarly, you might use a pattern analysis definition to implement business rules that prevent users from entering malformed data values into a table.