The Elasticsearch out-of-the-box tools
Elasticsearch primarily works with two models of information retrieval: the Boolean model and the vector space model. Other scoring algorithms are also available in Elasticsearch, such as Okapi BM25, Divergence from Randomness (DFR), and Information Based (IB). Working with these three additional models requires extensive mathematical knowledge and some extra configuration in Elasticsearch, both of which are beyond the scope of this book.
The Boolean model uses the AND, OR, and NOT conditions in a query to find all the matching documents. This Boolean model can be further combined with the Lucene scoring formula, TF/IDF (which we have already discussed in Chapter 2, Understanding Document Analysis and Creating Mappings), to rank documents.
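To make the Boolean model concrete, the following is a minimal Python sketch of AND, OR, and NOT matching over a toy inverted index. It is not how Elasticsearch or Lucene implement matching internally; the three sample documents and the postings helper are made up for illustration, and no ranking is applied.

```python
# Toy corpus; the documents and their ids are hypothetical examples.
docs = {
    1: "elasticsearch is a search engine",
    2: "lucene powers elasticsearch",
    3: "solr is also built on lucene",
}

# Build an inverted index: term -> set of ids of the documents containing it.
index = {}
for doc_id, text in docs.items():
    for term in text.split():
        index.setdefault(term, set()).add(doc_id)


def postings(term):
    """Return the set of document ids that contain the term (empty if none)."""
    return index.get(term, set())


# AND: documents that contain both terms.
both = postings("elasticsearch") & postings("lucene")

# OR: documents that contain at least one of the terms.
either = postings("elasticsearch") | postings("solr")

# NOT: documents that contain "lucene" but not "elasticsearch".
without = postings("lucene") - postings("elasticsearch")

print(both, either, without)
```

The Boolean model only answers whether a document matches. Every document in a result set is treated as equally relevant until a scoring formula such as TF/IDF is applied on top of it.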
The vector space model works differently from the Boolean model, as it represents both queries and documents as vectors. In the vector space model, each number in the vector is the weight of a term...
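The idea of comparing a query vector with document vectors can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the weights here are raw term counts rather than TF/IDF weights, similarity is plain cosine similarity, and the sample documents and the to_vector and cosine helpers are hypothetical names, not part of Elasticsearch.

```python
import math
from collections import Counter

# Toy corpus; the documents and their ids are hypothetical examples.
docs = {
    1: "elasticsearch is a search engine",
    2: "lucene powers elasticsearch",
    3: "solr is also built on lucene",
}


def to_vector(text):
    """Represent text as a sparse vector: term -> weight (raw count, an assumption)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


query = to_vector("lucene elasticsearch")

# Score every document against the query and sort from best to worst match.
scores = {doc_id: cosine(query, to_vector(text)) for doc_id, text in docs.items()}
ranked = sorted(scores, key=scores.get, reverse=True)

print(ranked)
```

In this sketch the document that contains both query terms ranks first, because its vector points in the direction closest to the query vector.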