Hive performance tuning
In this recipe, we will cover Hive tuning by looking at some important parameters. Hive is a data warehousing solution that runs on top of Hadoop, as discussed in Chapter 7, Data Ingestion and Workflow. Refer to that chapter for the installation and configuration of Hive.
Getting ready
Make sure that you have a running cluster with Hive installed and configured to work with the ZooKeeper ensemble. Refer to Chapter 7, Data Ingestion and Workflow, for the Hive configuration steps.
How to do it...
Connect to the Edge node client1.cyrus.com and switch to the hadoop user.
If you have followed the previous recipes, Hive is installed at /opt/cluster/hive on the Edge node.
First, tune the JVM heap that Hive uses when it is started from the shell by editing the hive-env.sh file, as shown in the following screenshot.
Configure the local Hive scratch space on a separate disk by using the following configuration:
<property> <name>hive.exec.local.scratchdir</name...
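For the heap-tuning step above, the following is a minimal sketch of what the relevant part of hive-env.sh might look like. It is not the content of the original screenshot: the heap size of 2048 MB is an illustrative assumption, and the right value depends on the memory available on the Edge node. HADOOP_HEAPSIZE and the SERVICE check follow the pattern used in the hive-env.sh.template file that ships with Hive.

```shell
# Sketch of a hive-env.sh fragment; the values are illustrative assumptions.

# Heap size (in MB) for the JVM started by the Hive shell scripts.
# 2048 is a placeholder, not a recommendation from the recipe.
export HADOOP_HEAPSIZE=2048

# Apply extra JVM options only when the Hive CLI is the service being started.
# SERVICE is set by the bin/hive launcher; it is empty when this file is
# sourced on its own, so the block below is skipped in that case.
if [ "${SERVICE:-}" = "cli" ]; then
  export HADOOP_OPTS="${HADOOP_OPTS:-} -XX:-UseGCOverheadLimit"
fi
```

After editing the file, restart the Hive shell so that the new heap setting is picked up.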