Ignoring defined words during search
Imagine that you want to filter offensive words out of the indexed data. Such words should be ignored at indexing time and must not be searchable. Can Solr do this? Yes, and this section shows how.
To avoid using actual offensive words in the demonstration, we will use the term offensive, which stands for any offensive word we would like to keep from being searched.
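Before turning to the Solr configuration, the underlying idea can be illustrated outside Solr: if a word is dropped during analysis, it never reaches the index and so cannot be matched. The following is a minimal conceptual sketch in Python, not Solr code; the names BLOCKED_WORDS, analyze, build_index, and search are hypothetical and exist only for this illustration, and the whitespace tokenization is a simplifying assumption.

```python
# Conceptual sketch only: this is NOT how Solr is implemented or configured.
# All names here (BLOCKED_WORDS, analyze, build_index, search) are hypothetical.
from collections import defaultdict

# Placeholder list of words to ignore; "offensive" stands in for any real word.
BLOCKED_WORDS = {"offensive"}


def analyze(text, blocked=BLOCKED_WORDS):
    """Lowercase the text, split on whitespace, and drop blocked tokens."""
    return [token for token in text.lower().split() if token not in blocked]


def build_index(docs):
    """Build a tiny inverted index: token -> set of document ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in analyze(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every analyzed query token."""
    tokens = analyze(query)
    if not tokens:
        # The query consisted only of blocked words, so nothing can match.
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


docs = {"1": "An offensive remark", "2": "A polite remark"}
index = build_index(docs)
```

In this sketch, searching for the blocked word returns nothing because it was never indexed, while the remaining words of the same document stay searchable.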
To start, we will define the following index structure in the fields section of our schema.xml file:
<field name="o_id" type="string" indexed="true" stored="true" required="true" />
<field name="o_name" type="text_offensive" indexed="true" stored="true" />
Now, let us define the text_offensive field type in the types section of our schema.xml file as follows:
<fieldType name="text_offensive" class="solr.TextField" positionIncrementGap="100"> <analyzer> <tokenizer class...