Functions and CALL Routines |
Category: | Character |
Restriction: | I18N Level 0 |
Syntax |
SPEDIS(query,keyword) |
Arguments |
query | identifies the word to query for the likelihood of a match. SPEDIS removes trailing blanks before comparing the value. |
keyword | specifies a target word for the query. SPEDIS removes trailing blanks before comparing the value. |
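The trailing-blank behavior matters because character variables in a DATA step are padded to their declared length. The following is a minimal sketch, assuming hypothetical variable lengths of 12 and 8 bytes chosen only to force padding; it is not taken from the original documentation.

```sas
data _null_;
   /* Hypothetical lengths, deliberately longer than the words themselves */
   length query $ 12 keyword $ 8;
   query   = 'fuzzy';   /* stored as 'fuzzy' followed by 7 blanks */
   keyword = 'fuzzy';   /* stored as 'fuzzy' followed by 3 blanks */
   /* SPEDIS is described as removing trailing blanks before comparing */
   distance = spedis(query, keyword);
   put distance=;
run;
```

Given the description above, the padding should not contribute to the distance, so the two values are expected to be treated as an exact match.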
Details |
Length of Returned Variable |
In a DATA step, if the SPEDIS function returns a value to a variable that has not previously been assigned a length, then that variable is given a length of 200 bytes.
The Basics |
SPEDIS returns the distance between the query and a keyword, a nonnegative value that is usually less than 100 but never greater than 200 with the default costs.
SPEDIS computes an asymmetric spelling distance between two words as the normalized cost for converting the keyword to the query word by using a sequence of operations. SPEDIS(QUERY, KEYWORD) is not the same as SPEDIS(KEYWORD, QUERY).
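To see the asymmetry in practice, the two argument orders can be evaluated side by side. This is a small illustrative sketch; the variable names d1 and d2 are hypothetical and the word pair is borrowed from the Examples section.

```sas
data _null_;
   /* Same two words, arguments swapped */
   d1 = spedis('fuzy', 'fuzzy');   /* query='fuzy',  keyword='fuzzy' */
   d2 = spedis('fuzzy', 'fuzy');   /* query='fuzzy', keyword='fuzy'  */
   put d1= d2=;
run;
```

The two results need not agree: the operations that convert the keyword to the query differ between the two calls, and the cost is normalized by the length of a different query word in each case.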
Costs for each operation that is required to convert the keyword to the query are listed in the following table:
The distance is the sum of the costs divided by the length of the query. If this ratio is greater than one, the result is rounded down to the nearest whole number.
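The normalization and rounding step can be sketched with ordinary DATA step arithmetic. The total cost of 200 below is an assumed value chosen for illustration (it is consistent with the firstins row in the Examples output, but it is not read from the cost table), and the variable names are hypothetical.

```sas
data _null_;
   total_cost = 200;        /* assumed sum of operation costs, for illustration only */
   query      = 'pfuzzy';   /* six-letter query word */
   /* Sum of costs divided by the length of the query */
   ratio      = total_cost / length(query);
   /* The ratio exceeds one, so it is rounded down to a whole number */
   distance   = floor(ratio);
   put ratio= distance=;
run;
```

Under that assumption the ratio is 200/6, roughly 33.3, which rounds down to 33. This sketch only mirrors the arithmetic described above; it does not call SPEDIS itself.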
Comparisons |
The SPEDIS function is similar to the COMPLEV and COMPGED functions, but COMPLEV and COMPGED are much faster, especially for long strings.
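The three functions can be called on the same pair of words for a side-by-side look. This is a minimal sketch with hypothetical variable names; COMPLEV and COMPGED are called here with only their two required string arguments.

```sas
data _null_;
   query   = 'fuzy';
   keyword = 'fuzzy';
   d_spedis  = spedis(query, keyword);    /* asymmetric spelling distance, normalized by query length */
   d_complev = complev(query, keyword);   /* Levenshtein edit distance */
   d_compged = compged(query, keyword);   /* generalized edit distance */
   put d_spedis= d_complev= d_compged=;
run;
```

The three results are not expected to be on the same scale, so a cutoff that works for one function should not be reused for another without checking.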
Examples |
options nodate pageno=1 linesize=64;

data words;
   input Operation $ Query $ Keyword $;
   /* Normalized spelling distance for converting the keyword to the query */
   Distance = spedis(query,keyword);
   /* Approximate total cost; Distance is rounded down, so this can understate it */
   Cost = distance * length(query);
   datalines;
match fuzzy fuzzy
singlet fuzy fuzzy
doublet fuuzzy fuzzy
swap fzuzy fuzzy
truncate fuzz fuzzy
append fuzzys fuzzy
delete fzzy fuzzy
insert fluzzy fuzzy
replace fizzy fuzzy
firstdel uzzy fuzzy
firstins pfuzzy fuzzy
firstrep wuzzy fuzzy
several floozy fuzzy
;

proc print data = words;
run;
The output from PROC PRINT is as follows.
                        The SAS System                         1

Obs    Operation    Query     Keyword    Distance    Cost
  1    match        fuzzy     fuzzy          0          0
  2    singlet      fuzy      fuzzy          6         24
  3    doublet      fuuzzy    fuzzy          8         48
  4    swap         fzuzy     fuzzy         10         50
  5    truncate     fuzz      fuzzy         12         48
  6    append       fuzzys    fuzzy          5         30
  7    delete       fzzy      fuzzy         12         48
  8    insert       fluzzy    fuzzy         16         96
  9    replace      fizzy     fuzzy         20        100
 10    firstdel     uzzy      fuzzy         25        100
 11    firstins     pfuzzy    fuzzy         33        198
 12    firstrep     wuzzy     fuzzy         40        200
 13    several      floozy    fuzzy         50        300
Copyright © 2011 by SAS Institute Inc., Cary, NC, USA. All rights reserved.