Using scripting update processors to modify documents
Sometimes, we need to modify documents during indexing, and we don't want to do this on the indexing application side. For example, suppose we have documents describing Internet sites, and we want to filter those sites by the protocol they use, for example, http or https. We don't have this information in a separate field; we only have the whole URL. Let's see how we can achieve this with Solr.
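To make the goal concrete, here is a minimal sketch of the kind of script such a processor could run. It assumes the script is executed by Solr's StatelessScriptUpdateProcessorFactory, which calls processAdd for each added document. The extractProtocol helper and the protocol target field are hypothetical names chosen for illustration, not necessarily the ones used later in this recipe.

```javascript
// Hypothetical helper: returns the lowercased protocol of a URL string,
// or null when the value has no "scheme://" prefix.
function extractProtocol(url) {
  if (url === null || url === undefined) {
    return null;
  }
  var match = /^([a-zA-Z][a-zA-Z0-9+.\-]*):\/\//.exec(String(url).trim());
  return match ? match[1].toLowerCase() : null;
}

// Called by the script update processor for every document being added.
// Assumption: a "protocol" field is defined in schema.xml.
function processAdd(cmd) {
  var doc = cmd.solrDoc; // org.apache.solr.common.SolrInputDocument
  var protocol = extractProtocol(doc.getFieldValue("url"));
  if (protocol !== null) {
    doc.setField("protocol", protocol);
  }
}

// The remaining hooks are left as no-ops, as we only care about added documents.
function processDelete(cmd) {}
function processMergeIndexes(cmd) {}
function processCommit(cmd) {}
function processRollback(cmd) {}
function finish() {}
```

Keeping the parsing in a separate function means it can be checked outside Solr, for example by calling extractProtocol("https://example.com") in any JavaScript console.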
Getting ready
Before continuing with this recipe, I suggest reading the Counting the number of fields recipe in this chapter to get familiar with update request processor configuration.
How to do it...
The following steps will take you through the process of achieving our goal:
First, we start with the index structure, putting the following section in the schema.xml file:
<field name="id" type="string" indexed="true" stored="true" required="true" />
<field name="url" type="text_general" indexed="true" stored="true"/>
<field...