Distributed search
Distributed search in Solr means splitting an index into multiple shards, querying those shards, and merging the results. Imagine that your index is too large to fit on a single system, or that a query takes too long to execute against a single index. Distributed search is designed for exactly these situations.
In either scenario, you distribute a request across all the shards in a list using the shards parameter. The value of the parameter follows this syntax:
host:port/base_url[,host:port/base_url]
Note
You can list any number of hosts in a single request; the request is distributed across as many shards as you list. How many shards you need depends on how expensive your query is and how large your index is.
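To make the syntax concrete, here is a minimal sketch in Python that assembles a distributed query URL. The function name build_distributed_query, the host names, the ports, and the example query are assumptions chosen for illustration; only the shards parameter and its comma-separated host:port/base_url format come from the text above.

```python
from urllib.parse import urlencode


def build_distributed_query(coordinator, shards, query, rows=10):
    """Return a Solr select URL that distributes the query across the given shards.

    coordinator and each shard use the host:port/base_url form (no scheme).
    This helper is hypothetical; it only assembles a URL string.
    """
    if not shards:
        raise ValueError("at least one shard is required")
    params = {
        "q": query,
        "rows": rows,
        # The shards value is a comma-separated list of host:port/base_url.
        "shards": ",".join(shards),
    }
    # safe=":/," keeps the shard list readable instead of percent-encoding it.
    return f"http://{coordinator}/select?{urlencode(params, safe=':/,')}"


# Assumed setup: two Solr instances on one machine, listening on different ports.
url = build_distributed_query(
    "localhost:8983/solr",
    ["localhost:8983/solr", "localhost:7574/solr"],
    "title:solr",
)
print(url)
```

The instance that receives the request can itself appear in the shard list, as it does in this example.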
By default, the requests sent to the individual shards go to the standard request handler (not necessarily the handler that received the original request); you can override this using the shards.qt parameter.
The following components support distributed search:
Query component
Facet component
Highlighting component
Stats component
Spell check component
Terms component
Term vector component
Debug component
Grouping component
On the other hand, distributed search has the following limitations:
Unique key requirements
No distributed IDF
Doesn't support QueryElevationComponent
Doesn't support Join
Index variations between stages
Distributed deadlock
Distributed indexing
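The unique key requirement and the missing distributed IDF are easier to picture with a small sketch of the merge step. The following Python code is an illustration only, not Solr's actual merge logic: the function merge_shard_results, the field names id and score, and the sample documents are all assumptions. It merges per-shard result lists that are already sorted by descending score and uses the unique key to tell documents apart.

```python
import heapq


def merge_shard_results(shard_results, rows=10, unique_key="id"):
    """Merge per-shard hit lists (each sorted by descending score) into one top-N list.

    Illustrative only: keeping the first document for a repeated key is this
    sketch's own choice, not documented Solr behavior.
    """
    # heapq.merge lazily merges inputs that are already sorted; reverse=True
    # matches the descending-score order of each shard's list.
    merged = heapq.merge(
        *shard_results, key=lambda doc: doc["score"], reverse=True
    )
    seen = set()
    top = []
    for doc in merged:
        if doc[unique_key] in seen:
            continue  # the same key came back from two shards; skip the repeat
        seen.add(doc[unique_key])
        top.append(doc)
        if len(top) == rows:
            break
    return top


# Made-up results from two shards; document "3" wrongly exists on both.
shard_a = [{"id": "1", "score": 2.5}, {"id": "3", "score": 1.1}]
shard_b = [{"id": "2", "score": 1.9}, {"id": "3", "score": 0.7}]

top_ids = [doc["id"] for doc in merge_shard_results([shard_a, shard_b], rows=3)]
print(top_ids)
```

Without a unique key there is no way to recognize that both shards returned the same document. Note also that each shard computes its scores from its own term statistics, so with no distributed IDF the scores compared in this merge are not strictly comparable across shards.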