We have seen an overview of text analysis. Now let's dive deeper and look at the core processes that run behind the scenes during analysis. As we saw previously, the analyzer, the tokenizer, and the filter are the three main components Solr uses for text analysis. Let's start with the analyzer.
Understanding analyzers
What is an analyzer?
An analyzer examines the text of fields and generates a token stream. Normally, only fields of type solr.TextField will specify an analyzer. An analyzer is defined as a child element of the <fieldType> element in the managed-schema.xml file. Here is a simple analyzer configuration:
<fieldType name="text_en" class="solr.TextField" positionIncrementGap...
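For illustration, a minimal complete field type with an analyzer might look like the following sketch. The field type name text_example is hypothetical, and the choice of solr.StandardTokenizerFactory and solr.LowerCaseFilterFactory is an assumption made for this example, not the configuration shown above.

```xml
<!-- Hypothetical field type: the name and the tokenizer/filter choices are assumptions -->
<fieldType name="text_example" class="solr.TextField">
  <!-- The analyzer is a child of fieldType and produces the token stream -->
  <analyzer>
    <!-- Split the field text into tokens at word boundaries -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- Lowercase each token so that matching ignores case -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

With a definition along these lines, any field declared with type text_example would have its text tokenized first and then lowercased, in the order the child elements appear inside the analyzer.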