Configuring HiveServer2 high availability
In a cluster of thousands of nodes, a single HiveServer2 instance is a single point of failure unless it is configured for high availability. If the HiveServer2 service goes down, no client that connects through it can access the metastore or submit Hive queries to the cluster. To remove this limitation, you can configure HiveServer2 for high availability, which requires a ZooKeeper quorum running on a set of nodes.
ZooKeeper is an open source, centralized service that coordinates distributed applications. It also stores shared configuration and metadata, and it provides distributed synchronization. Hive uses ZooKeeper to store the configuration information that makes HiveServer2 highly available.
Getting ready
To configure high availability for HiveServer2, you need a running ZooKeeper quorum.
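As a quick readiness check, the following Python sketch probes each ZooKeeper server with the `ruok` four-letter command and reports whether a majority answered `imok`. It is only a rough indicator under stated assumptions: the function names and the hostnames in the usage comment are hypothetical, the default client port 2181 is assumed, and newer ZooKeeper releases may require `ruok` to be enabled through the `4lw.commands.whitelist` setting. An `imok` reply shows that the server process is running, not that it has joined the quorum.

```python
import socket


def parse_quorum(quorum, default_port=2181):
    """Split a string such as 'host1:2181,host2' into (host, port) pairs."""
    servers = []
    for entry in quorum.split(","):
        entry = entry.strip()
        if not entry:
            continue
        host, sep, port = entry.partition(":")
        # Fall back to the assumed default client port when none is given.
        servers.append((host, int(port) if sep else default_port))
    return servers


def is_zookeeper_ok(host, port, timeout=3.0):
    """Send 'ruok' and return True only if the server replies 'imok'."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(b"ruok")
            reply = b""
            # Read until we have the 4-byte answer or the server closes.
            while len(reply) < 4:
                chunk = conn.recv(4 - len(reply))
                if not chunk:
                    break
                reply += chunk
            return reply == b"imok"
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


def check_quorum(quorum):
    """Return (responding servers, True if a majority responded)."""
    servers = parse_quorum(quorum)
    healthy = [server for server in servers if is_zookeeper_ok(*server)]
    return healthy, len(healthy) > len(servers) // 2


# Example usage (hypothetical hostnames):
# healthy, majority = check_quorum("zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181")
# print(healthy, majority)
```

If fewer than a majority of servers respond, fix the ZooKeeper ensemble before continuing with the HiveServer2 configuration.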
Tip
Installing ZooKeeper is outside the scope of this book. Refer to the following links for installation instructions.
- Refer...