Operating Hive with ZooKeeper
It is important to configure high availability in production so that if one of the HiveServer2 instances fails, the others can respond to client requests. This can be achieved by using the ZooKeeper discovery mechanism to point the clients to the active Hive servers.
Secondly, to enable concurrency, it is important to run the table manager, which is a lock manager. Both of these features require setting up a ZooKeeper ensemble and configuring the Hive clients to use it.
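To make the two features concrete, the following is a minimal sketch of the kind of hive-site.xml entries they usually involve. The property names are standard Hive settings, but the ZooKeeper hostnames (zk1.example.com and so on), the port, and the namespace value are assumptions chosen for illustration, not the values used in this recipe.

```xml
<!-- Sketch only: hostnames, port, and namespace are illustrative assumptions -->
<configuration>
  <!-- ZooKeeper ensemble that Hive talks to (hypothetical hosts) -->
  <property>
    <name>hive.zookeeper.quorum</name>
    <value>zk1.example.com,zk2.example.com,zk3.example.com</value>
  </property>
  <property>
    <name>hive.zookeeper.client.port</name>
    <value>2181</value>
  </property>

  <!-- Concurrency: turn on locking and use the ZooKeeper-backed lock manager -->
  <property>
    <name>hive.support.concurrency</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.lock.manager</name>
    <value>org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager</value>
  </property>

  <!-- High availability: each HiveServer2 registers itself under this namespace -->
  <property>
    <name>hive.server2.support.dynamic.service.discovery</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.server2.zookeeper.namespace</name>
    <value>hiveserver2</value>
  </property>
</configuration>
```

With settings along these lines, a JDBC client would typically connect through the ensemble rather than to a single server, using a URL of the form jdbc:hive2://zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2 (again with hypothetical hostnames).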
Getting ready
To progress through the recipe in this section, we need a running ZooKeeper ensemble. Please refer to Chapter 4, High Availability, for details of how to configure ZooKeeper clients. Users must also have completed the previous recipe, Using MySQL for Hive metastore.
How to do it...
Connect to the edge node edge1.cyrus.com in the cluster and switch to the hadoop user.
Modify the hive-site.xml file and enable the table manager by using the properties as follows. This is for concurrency:
<property...