Routing documents to a particular shard
As we have seen, SolrCloud automatically distributes documents across the shards of an index. A query against the cloud gathers results from all the shards, merges them, and sends them back. Why, then, would we want to route documents to a particular shard?
Suppose that we have a huge SolrCloud cluster of, say, 100 servers, with 30 shards and 3 replicas for each shard. This gives us ample room to manage a large-scale index that grows to several terabytes of data. A query that fetches documents matching some criterion would go to all 30 shards in the index. The machine on which the query is executed would accumulate the results from all 30 shards and create the final result set. This would involve a huge movement of data between shards, and the shard performing the merge operation would have to do some heavy processing, since it would move through 30 different result sets and merge them into a single...