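We have already learned about the inverted index, and we know that Elasticsearch stores documents in one. Before a document is indexed, its text is converted into the terms that populate that index; this transformation is known as analysis. A search query can only match the terms that analysis produced, so appropriate analysis is a prerequisite for useful search results.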
We also often need to transform a document's text before it reaches the Elasticsearch index. We may need to lowercase the text, strip any HTML tags from it, remove extra whitespace between words, tokenize fields on delimiters, and so on.
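To make these transformations concrete, here is a minimal sketch in plain Python. It is not how Elasticsearch implements analysis; the function names (strip_html, tokenize, lowercase, analyze, build_inverted_index) and the sample documents are made up for illustration, and the regular expressions are deliberately crude.

```python
import re
from collections import defaultdict


def strip_html(text):
    # Hypothetical character filter: replace anything that looks like a tag with a space.
    return re.sub(r"<[^>]+>", " ", text)


def tokenize(text):
    # Hypothetical tokenizer: split on runs of non-alphanumeric delimiters,
    # which also discards extra whitespace between words.
    return [token for token in re.split(r"[^0-9A-Za-z]+", text) if token]


def lowercase(tokens):
    # Hypothetical token filter: normalize case so "Quick" and "quick" match.
    return [token.lower() for token in tokens]


def analyze(text):
    # Run the three steps in order: character filter, tokenizer, token filter.
    return lowercase(tokenize(strip_html(text)))


def build_inverted_index(docs):
    # Map each analyzed term to the set of document IDs that contain it.
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in analyze(text):
            index[term].add(doc_id)
    return index


docs = {1: "<p>Quick   Brown Fox</p>", 2: "The quick, lazy dog"}
index = build_inverted_index(docs)

print(analyze(docs[1]))        # ['quick', 'brown', 'fox']
print(sorted(index["quick"]))  # [1, 2]
```

Because both documents pass through the same steps, a search for "quick" finds document 1 even though the original text contained "Quick" wrapped in HTML.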
Elasticsearch offers the following built-in analyzers:
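- Standard analyzer: This is the default analyzer. It uses the standard tokenizer to divide text into tokens on word boundaries, lowercases the tokens, and drops most punctuation. It can also remove stop words, although in recent versions that filter is disabled by default.
- Simple analyzer: This analyzer is built on the lowercase tokenizer, which splits text at every non-letter character and lowercases the resulting tokens.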
- Whitespace analyzer: This uses the whitespace tokenizer...
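
The practical difference between the standard, simple, and whitespace analyzers is easiest to see on one sample sentence. The sketch below only approximates each analyzer with a regular expression or a plain split; the function names are hypothetical, and the real standard tokenizer follows Unicode text segmentation rules that this pattern does not fully reproduce.

```python
import re


def standard_like(text):
    # Rough stand-in for the standard analyzer: lowercase, then keep word
    # characters, allowing an apostrophe inside a word (as in "dog's").
    return re.findall(r"\w+(?:'\w+)*", text.lower())


def simple_like(text):
    # Rough stand-in for the simple analyzer: lowercase, then keep runs of
    # letters only, so digits and punctuation act as separators.
    return re.findall(r"[^\W\d_]+", text.lower())


def whitespace_like(text):
    # Rough stand-in for the whitespace analyzer: split on whitespace only,
    # leaving case and punctuation untouched.
    return text.split()


sample = "The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."

print(standard_like(sample))
# ['the', '2', 'quick', 'brown', 'foxes', 'jumped', 'over', 'the', 'lazy', "dog's", 'bone']
print(simple_like(sample))
# ['the', 'quick', 'brown', 'foxes', 'jumped', 'over', 'the', 'lazy', 'dog', 's', 'bone']
print(whitespace_like(sample))
# ['The', '2', 'QUICK', 'Brown-Foxes', 'jumped', 'over', 'the', 'lazy', "dog's", 'bone.']
```

To see what a real cluster produces, the same sentence can be sent to the _analyze API with the analyzer parameter set to standard, simple, or whitespace.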