Tuning segment merging
As you might know, a Lucene index is built of one or more segments. In general, a segment is a write-once, read-many data structure: once written, it is not updated, apart from a few parts of it such as the information about deleted documents. Segment merging is the process of combining multiple segments into a new one to reduce the overall number of segments the index is built of. Lucene does this for performance reasons: the fewer segments a search has to visit, the better the search performance. On the other hand, a segment merge is a resource-intensive process, because it requires reading the old segments and writing the new one. Because of all this, it is good to know how to tune segment merging for our own purposes, and this recipe will show you how to do that.
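To make the trade-off concrete, here is a minimal sketch of the idea in plain Java. It is a toy model, not Lucene's API: the `Segment` class, the `merge` and `mergeCost` methods, and the document counts are all hypothetical names and numbers chosen for illustration. It assumes that a merge reads every document in the source segments and writes only the live (non-deleted) ones into a single new segment.

```java
import java.util.List;

public class SegmentMergeSketch {

    /** Toy stand-in for a segment: a fixed document count plus a count of documents flagged as deleted. */
    static final class Segment {
        final int docs;     // documents physically stored in the segment
        final int deleted;  // documents marked as deleted (still stored until a merge)

        Segment(int docs, int deleted) {
            this.docs = docs;
            this.deleted = deleted;
        }

        int live() {
            return docs - deleted;
        }
    }

    /** Combines the source segments into one new segment that holds only the live documents. */
    static Segment merge(List<Segment> sources) {
        int live = 0;
        for (Segment s : sources) {
            live += s.live();
        }
        return new Segment(live, 0);
    }

    /** Rough cost of a merge in this model: documents read from the old segments plus documents written to the new one. */
    static int mergeCost(List<Segment> sources) {
        int read = 0;
        for (Segment s : sources) {
            read += s.docs;
        }
        return read + merge(sources).docs;
    }

    public static void main(String[] args) {
        // Hypothetical index made of three segments.
        List<Segment> index = List.of(
                new Segment(100, 10),
                new Segment(50, 0),
                new Segment(20, 5));

        Segment merged = merge(index);
        System.out.println("segments before merge: " + index.size());
        System.out.println("segments after merge: 1");
        System.out.println("documents in merged segment: " + merged.docs);
        System.out.println("documents read + written: " + mergeCost(index));
    }
}
```

In this model, a search after the merge visits one segment instead of three, but the merge itself has to touch every stored document. That cost is what the merge settings in this recipe let us control.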
How to do it...
- For the purpose of this recipe, I assume that we are starting with the basic Solr configuration, which looks as follows when it comes to segment merging (we...