Introduction
With the rapid adoption of the Hadoop/HBase ecosystem, a wide range of applications, both real-time and batch, are being built on top of it. It is therefore critical to plan the entire architecture up front so that it has built-in elasticity, adapts to regional datacenters, provides globally distributed redundancy, and forms a multi-layered, globally distributed design that ties together hardware, software, network, and the growing demands of customers as the business matures.
In the early years of Hadoop/HBase, most use cases were driven by batch processing systems, but the trend is converging toward a single platform that scales equally well when near real-time use cases run against the ecosystem.
It is essential to plan the architecture properly before setting up a very large-scale HBase system; doing so allows the system to do the following:
- Scale elastically (auto scaling) with built-in fault tolerance
- Work on different virtual machine, physical, and cloud hardware
- Give near real-time throughput...