Sharding and fault tolerance
We have already seen sharding, collections, and replicas in Chapter 6, Distributed Search Using Apache Solr. In this section, we will look at some of the important aspects of sharding and the role it plays in scalability and high availability. The strategy for creating new shards depends heavily on the hardware and the shard size. Let's say you have two machines, A and B, of the same configuration, each hosting one shard. Shard A is loaded with 1 million indexed documents, and shard B is loaded with 100 documents. When a query is fired, it is sent to both shards, and the overall response time of any distributed Solr query is determined by the response time of the slowest shard (in this case, shard A). Hence, shards of nearly equal size perform better in this case.
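The effect of uneven shard sizes can be illustrated with a small toy model. This is only a sketch: it assumes that a shard's response time grows linearly with its document count, and the function name and the `ms_per_million_docs` cost figure are hypothetical, not measurements from Solr. Real response times also depend on the query, caching, and hardware.

```python
def distributed_query_latency(shard_doc_counts, ms_per_million_docs=100.0):
    """Toy model of a distributed query.

    Assumption (not a Solr measurement): each shard's response time is
    proportional to the number of documents it holds. The query fans out
    to all shards in parallel, so the overall response time is that of
    the slowest shard.
    """
    per_shard_ms = [
        count / 1_000_000 * ms_per_million_docs for count in shard_doc_counts
    ]
    return max(per_shard_ms)


# Skewed layout from the example: shard A has 1 million documents, shard B has 100.
skewed = distributed_query_latency([1_000_000, 100])

# The same 1,000,100 documents split evenly across the two shards.
balanced = distributed_query_latency([500_050, 500_050])

print(f"skewed:   {skewed:.3f} ms")
print(f"balanced: {balanced:.3f} ms")
```

Under this model, the skewed layout is bounded by shard A alone, while the balanced layout roughly halves the slowest shard's work, which is why shards of similar size are preferable.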
Document routing and sharding
We have seen the leader-selection process in Chapter 6, Distributed Search Using Apache Solr. Typically, once an enterprise search is deployed, the volume of documents to be indexed keeps growing over time. As...