Enterprise search data-processing patterns
Enterprise search has evolved from basic web-crawl document search into more sophisticated search over structured and unstructured content, with much richer user interaction. As data volumes grow, the focus is shifting toward the effective use of distributed technology to handle them. At the same time, the cost of enterprise storage needs to be controlled. Enterprise-ready search also demands high availability and scalability. By design, an enterprise search implementation should be capable of handling large indexes. As an index grows, the capacity of a single server becomes the limiting factor for the search server. At that point, sharding the index becomes essential.
Tip
Sharding is the process of breaking one index into multiple logical units called "shards", which are distributed across multiple nodes. In the case of Solr, a query is run against the shards, and the results are aggregated and returned.
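To make the idea of sharding concrete, here is a minimal, self-contained sketch in Python. It is not how Solr is implemented: the shard count, the hash-based routing, the term-count scoring, and all function names (shard_for, index_documents, search_shard, search) are assumptions chosen purely for illustration. It shows documents being split across several in-memory "shards" and a query being run against each shard before the partial results are merged.

```python
import hashlib

NUM_SHARDS = 3  # hypothetical shard count, chosen for illustration


def shard_for(doc_id, num_shards=NUM_SHARDS):
    """Pick a shard for a document using a stable hash of its ID."""
    digest = hashlib.md5(doc_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def index_documents(docs, num_shards=NUM_SHARDS):
    """Split one logical index (doc_id -> text) into several shards."""
    shards = [dict() for _ in range(num_shards)]
    for doc_id, text in docs.items():
        shards[shard_for(doc_id, num_shards)][doc_id] = text
    return shards


def search_shard(shard, term):
    """Search a single shard; the score is a naive term count."""
    term = term.lower()
    hits = []
    for doc_id, text in shard.items():
        score = text.lower().split().count(term)
        if score > 0:
            hits.append((score, doc_id))
    return hits


def search(shards, term, limit=10):
    """Query every shard, then aggregate the partial results."""
    all_hits = []
    for shard in shards:
        all_hits.extend(search_shard(shard, term))
    # Highest score first; ties are broken by document ID.
    return sorted(all_hits, key=lambda hit: (-hit[0], hit[1]))[:limit]


docs = {
    "doc-a": "solr search makes enterprise search scale",
    "doc-b": "hadoop stores large volumes of data",
    "doc-c": "a search index can be split into shards",
}
shards = index_documents(docs)
results = search(shards, "search")
print(results)
```

In this sketch the caller sees a single ranked result list, regardless of which shard each document landed on. In a real deployment, each shard would live on its own server and be queried over the network.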
Let's look at different...