COMPLEV Function

Returns the Levenshtein edit distance between two strings.

Category: Character
Restriction: I18N Level 0 functions are designed for use with Single Byte Character Sets (SBCS) only.

Syntax

COMPLEV(string-1, string-2 <,cutoff> <,modifiers> )

Required Arguments

string-1

specifies a character constant, variable, or expression.

string-2

specifies a character constant, variable, or expression.

Optional Arguments

cutoff

specifies a numeric constant, variable, or expression. If the actual Levenshtein edit distance is greater than the value of cutoff, the value that is returned is equal to the value of cutoff.

Tip
Using a small value of cutoff improves the efficiency of COMPLEV if the values of string-1 and string-2 are long.
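
The effect of cutoff can be sketched with a short DATA step. This is an illustrative example only: the variable names and string values are made up, and the two strings are chosen so that they share no characters and have the same length, which makes the full edit distance equal to that length.

```sas
data _null_;
   /* Hypothetical inputs: equal length, no characters in common */
   s1 = 'abcdefgh';
   s2 = 'zzzzzzzz';

   /* No cutoff: the full Levenshtein edit distance is computed */
   full = complev(s1, s2);

   /* Cutoff of 3: the actual distance is greater than 3, */
   /* so the returned value is capped at the cutoff       */
   capped = complev(s1, s2, 3);

   put full= capped=;
run;
```

Under the rule described above, capped should equal 3 while full reports the uncapped distance. A cutoff is useful when you need to know only whether two strings are within a given distance of each other.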

modifiers

specifies a character string that can modify the action of the COMPLEV function. You can use one or more of the following characters as a valid modifier:

i or I       ignores the case in string-1 and string-2.
l or L       removes leading blanks in string-1 and string-2 before comparing the values.
n or N       removes quotation marks from any argument that is an n-literal and ignores the case of string-1 and string-2.
: (colon)    truncates the longer of string-1 or string-2 to the length of the shorter string, or to one, whichever is greater.
Tip
COMPLEV ignores blanks that are used as modifiers.
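
As a minimal sketch of how the modifiers might be used, the following DATA step calls COMPLEV with string literals. The variable names are hypothetical, and the comments describe the intended effect of each modifier according to the descriptions above rather than verified output.

```sas
data _null_;
   /* Case-sensitive comparison: no modifier */
   d_case   = complev('aBc', 'AbC');

   /* i: ignore case in both strings */
   d_nocase = complev('aBc', 'AbC', 'i');

   /* l: remove leading blanks before comparing */
   d_lead   = complev('  abc', 'abc', 'l');

   /* colon: truncate the longer string to the length of the shorter one */
   d_trunc  = complev('abc', 'abcdef', ':');

   /* Several modifiers can be combined in one modifier string */
   d_combo  = complev('  ABC', 'abc', 'il');

   put d_case= d_nocase= d_lead= d_trunc= d_combo=;
run;
```

Because the third argument here is a character value, it is passed as modifiers and no cutoff is supplied, which is the same calling pattern that the example later in this section uses.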

Details

The order in which the modifiers appear in the COMPLEV function is relevant.
  • “LN” first removes leading blanks from each string and then removes quotation marks from n-literals.
  • “NL” first removes quotation marks from n-literals and then removes leading blanks from each string.
The COMPLEV function ignores trailing blanks.
COMPLEV returns the Levenshtein edit distance between string-1 and string-2. Levenshtein edit distance is the minimum number of single-character insertions, deletions, or replacements that are required to convert one string to the other. Levenshtein edit distance is symmetric. That is, COMPLEV(string-1,string-2) is the same as COMPLEV(string-2,string-1).
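
The symmetry property and the treatment of trailing blanks can be checked with a small sketch such as the following. The variable names and values are illustrative assumptions. The pair kitten and sitting is a common textbook example whose Levenshtein distance is 3 (two replacements and one insertion).

```sas
data _null_;
   /* Both variables are padded with trailing blanks to length 12 */
   length a b $ 12;
   a = 'kitten';
   b = 'sitting';

   /* Symmetry: the argument order should not change the result */
   d_ab = complev(a, b);
   d_ba = complev(b, a);

   /* Trailing blanks are ignored, so these strings compare as equal */
   d_pad = complev('abc   ', 'abc');

   put d_ab= d_ba= d_pad=;
run;
```

If the behavior matches the description above, d_ab and d_ba are equal and d_pad is 0.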

Comparisons

The Levenshtein edit distance that is computed by COMPLEV is a special case of the generalized edit distance that is computed by COMPGED.
COMPLEV executes much more quickly than COMPGED.
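
To compare the two functions on the same input, a sketch like the following could be used. The strings and variable names are hypothetical. COMPGED assigns a cost to each type of edit operation, so its result is generally on a different scale from the COMPLEV result and the two values should not be expected to match.

```sas
data _null_;
   /* Hypothetical inputs that differ by one inserted character */
   s1 = 'abc';
   s2 = 'abxc';

   /* Levenshtein edit distance: a count of single-character edits */
   lev = complev(s1, s2);

   /* Generalized edit distance: a sum of operation-specific costs */
   ged = compged(s1, s2);

   put lev= ged=;
run;
```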

Example

The following example compares two strings by computing the Levenshtein edit distance.
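data test;
   infile datalines missover;
   /* Columns 1-8: string1, columns 9-16: string2, columns 17-24: modifiers */
   input string1 $char8. string2 $char8. modifiers $char8.;
   /* modifiers is passed as the third argument; no cutoff is specified */
   result=complev(string1, string2, modifiers);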
   datalines;
1234567812345678
abc     abxc
ac      abc
aXc     abc
aXbZc   abc
aXYZc   abc
WaXbYcZ abc
XYZ     abcdef
aBc     abc
aBc     AbC      i
  abc   abc
  abc   abc      l
AxC     'abc'n
AxC     'abc'n   n
;

proc print data=test;
run; 
Results of Comparing Two Strings by Computing the Levenshtein Edit Distance

See Also

CALL Routines: