Finding the best
Search problems appear at various levels of abstraction: finding a word in a body of text is typically more complex than simply calling the contains() function, and if there are several results, which one is the one that was searched for? This entire class of problems falls under the umbrella of information retrieval, where problems of ranking, indexing, understanding, storing, and searching are solved in order to retrieve the optimal result (for whichever definition of optimal applies). This chapter focuses only on the last of these, searching, where we actually look through a collection of items (for example, an index) in order to find a match.
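To make the gap between "is it in there?" and "which match did you mean?" concrete, here is a minimal sketch in Rust. The helper name match_positions and the sample sentence are illustrative assumptions, not taken from the text; contains() and match_indices() are standard methods on str.

```rust
/// Hypothetical helper: byte offsets of every non-overlapping
/// occurrence of `needle` in `text`.
fn match_positions(text: &str, needle: &str) -> Vec<usize> {
    text.match_indices(needle).map(|(offset, _)| offset).collect()
}

fn main() {
    let text = "the cat sat next to the other cat";

    // contains() only answers whether at least one match exists.
    println!("contains: {}", text.contains("cat"));

    // Listing every match shows the ambiguity: which "cat" was meant?
    println!("positions: {:?}", match_positions(text, "cat"));
}
```

Deciding which of several positions is the right one is a ranking problem, which is outside the scope of a plain search.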
This means that we will compare items directly (a == b) to decide whether they match, rather than measuring closeness with something such as a distance function or locality-sensitive hashing. Those techniques belong to more specific domains, such as fuzzy search or matching bodies of text, which are fields of their own. To learn more about hashing, please check out Chapter 16, Exploring...
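As a rough illustration of what comparing items directly means, the sketch below walks a collection and tests each item with ==. The function name linear_search and the sample data are assumptions made for this example; the only requirement on the item type is that it implements PartialEq.

```rust
/// Hypothetical helper: index of the first item equal to `target`,
/// or None if nothing matches exactly.
fn linear_search<T: PartialEq>(items: &[T], target: &T) -> Option<usize> {
    for (i, item) in items.iter().enumerate() {
        // Exact equality only: no distance measure, no fuzzy matching.
        if item == target {
            return Some(i);
        }
    }
    None
}

fn main() {
    let index: &[&str] = &["apple", "banana", "cherry"];

    // An exact match is found at its position.
    println!("{:?}", linear_search(index, &"banana"));

    // A near miss is not a match, because closeness is never considered.
    println!("{:?}", linear_search(index, &"banan"));
}
```

A near miss such as "banan" finds nothing here, which is exactly the behavior that distance-based or fuzzy approaches are designed to relax.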