Fuzzy string matching
When searching for strings in the SQL Database Engine using =
, the strings must match exactly for the expression to evaluate to true. If we want to match only part of the string, however, we must use a LIKE
operator with wildcards. If we want to search for a pattern anywhere within a string, we need both leading and trailing wildcards. The problem with this is that it prevents us from being able to use an index or accurately estimate the cardinality. An index with a string key is sorted starting with the first character of the string, but if we are searching for a pattern that may appear in the middle of the string, the SQL Database Engine must scan every value and search for the matching pattern in each string in the column. A LIKE
operator with a leading wildcard (%a value
or %a value%
) almost always causes a scan operation.
Consider an example from the AdventureWorks
database where we want to find all the Flat Washers in the Product
table. We know they all...