String Distance Functions

These functions provide standard algorithms for measuring the similarity of two strings. Each accepts two strings as parameters and returns a numeric score, where a lower score indicates more similar strings. The specific meaning of the score depends on the algorithm. These functions support ASCII strings only and throw an error if either string contains non-ASCII characters. They are provided by the sqlean "fuzzy" extension, which is built into SQL Notebook.

Syntax

DLEVENSHTEIN(x, y)  -- Damerau-Levenshtein distance
EDIT_DISTANCE(x, y) -- Spellcheck edit distance
HAMMING(x, y)       -- Hamming distance
JARO_WINKLER(x, y)  -- Jaro-Winkler distance
LEVENSHTEIN(x, y)   -- Levenshtein distance
OSA_DISTANCE(x, y)  -- Optimal string alignment distance

Return Value

Numeric string distance. The specific meaning depends on the algorithm.

Example

PRINT LEVENSHTEIN('pickle', 'pickle'); -- 0
PRINT LEVENSHTEIN('pickle', 'tickle'); -- 1
PRINT LEVENSHTEIN('pickle', 'stick'); -- 4
PRINT LEVENSHTEIN('pickle', '🙂stick'); -- error: non-ASCII string
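
To make the meaning of the Levenshtein score concrete, the following is a minimal Python sketch of the textbook dynamic-programming algorithm. It is not the sqlean implementation, and the function name levenshtein is purely illustrative. The score is the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into the other, and the sketch reproduces the non-error values from the example above.

```python
def levenshtein(a: str, b: str) -> int:
    """Textbook Levenshtein distance (illustrative; not the sqlean code)."""
    # prev[j] is the distance between the processed prefix of a and b[:j].
    # Before any character of a is consumed, that is j insertions.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        # Turning a[:i] into the empty string takes i deletions.
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca from a
                curr[j - 1] + 1,     # insert cb into a
                prev[j - 1] + cost,  # substitute ca with cb (free on a match)
            ))
        prev = curr
    return prev[-1]


print(levenshtein("pickle", "pickle"))  # 0
print(levenshtein("pickle", "tickle"))  # 1
print(levenshtein("pickle", "stick"))   # 4
```

The other functions count different kinds of edits, so their scores for the same pair of strings are not necessarily comparable with these.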