Elasticsearch is the leading technology for the capability we are looking for, and that is the main reason for this choice.
The prominent reasons for choosing Elasticsearch for this capability in our Data Lake implementation are as follows:
- Compatibility with Hadoop (our persistent store)
- Distributed
- Scalable
- Capability of indexing data
- Highly performant (fast query and search)
- Battle-hardened technology (enterprise-grade, with the capabilities an enterprise requires)
- Capability of handling a huge volume and variety of data
- Failover and data redundancy capability
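
To make the distribution and redundancy points more concrete, here is a minimal sketch of the request body that could be sent when creating an index. The field names (`name`, `city`, `created_at`) and the shard and replica counts are assumptions chosen purely for illustration; `number_of_shards` and `number_of_replicas` are standard Elasticsearch index settings. The sketch only builds and prints the JSON body, so it does not need a running cluster.

```python
import json


def build_index_body(shards: int = 3, replicas: int = 1) -> dict:
    """Build a create-index request body for a hypothetical index.

    The mapping fields are illustrative assumptions, not taken from
    any real schema.
    """
    return {
        "settings": {
            # Primary shards spread the index across nodes (distribution, scale).
            "number_of_shards": shards,
            # Each replica is a full extra copy of every primary shard
            # (failover and data redundancy).
            "number_of_replicas": replicas,
        },
        "mappings": {
            "properties": {
                "name": {"type": "text"},        # analyzed for full-text search
                "city": {"type": "keyword"},     # exact-match filtering
                "created_at": {"type": "date"},
            }
        },
    }


def total_shard_copies(body: dict) -> int:
    """Return the number of shard copies (primaries plus replicas) for the index."""
    settings = body["settings"]
    return settings["number_of_shards"] * (1 + settings["number_of_replicas"])


body = build_index_body()
print(json.dumps(body, indent=2))
print("shard copies in the cluster:", total_shard_copies(body))
```

With three primary shards and one replica each, the cluster holds six shard copies in total. Elasticsearch does not place a replica on the same node as its primary, so with at least two nodes the loss of a single node leaves a copy of every shard available.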