Deploying Hive on a Hadoop cluster
Hive runs on a wide variety of platforms. GNU/Linux and Windows are commonly used for production environments, whereas Mac OS X is commonly used for development.
Getting ready
In this book, we assume a GNU/Linux-based installation of Apache Hive for the installation steps and all other instructions.
Before installing Hive, make sure that a Java SE environment is installed properly. Hive requires version 6 or later (Hive 1.2 and later releases require Java 7 or later), which can be downloaded from http://www.oracle.com/technetwork/java/javase/downloads/index.html.
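One way to confirm the Java prerequisite is to parse the output of java -version. The following is a minimal sketch and not part of Hive itself; the helper names java_major_version and installed_java_major are hypothetical, and the minimum version to compare against should be taken from the release notes of the Hive version being installed.

```python
import re
import subprocess


def java_major_version(version_output):
    """Extract the major Java version from `java -version` output.

    Handles both the legacy scheme ("1.7.0_79" means Java 7) and the
    newer scheme ("11.0.2" means Java 11). Returns None if no quoted
    version string is found.
    """
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first = int(match.group(1))
    second = match.group(2)
    if first == 1 and second is not None:
        # Legacy numbering: the second component is the major version.
        return int(second)
    return first


def installed_java_major():
    """Run `java -version` and return the major version, or None."""
    try:
        # The JVM prints its version banner to stderr, not stdout.
        result = subprocess.run(
            ["java", "-version"], capture_output=True, text=True
        )
    except FileNotFoundError:
        return None  # no `java` executable on the PATH
    return java_major_version(result.stderr + result.stdout)


if __name__ == "__main__":
    major = installed_java_major()
    if major is None:
        print("Java not found; install a Java SE environment first")
    else:
        print("Detected Java", major)
```

Keeping the parsing in its own function means the check can be exercised against sample version strings without a JVM on the machine.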
How to do it...
To install Hive, download the latest stable release from http://hive.apache.org/downloads.html and unpack it.
Note
At the time of writing this book, Hive 1.2.1 was the latest stable version available.
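The unpack step can be sketched in a few lines. This is an illustration only: it assumes the release has already been downloaded as a gzip-compressed tarball whose members share a single top-level directory, and the function name unpack_hive and the paths in the usage comment are hypothetical.

```python
import os
import tarfile


def unpack_hive(tarball_path, install_dir):
    """Unpack a Hive binary tarball into install_dir.

    Returns the path of the extracted top-level directory, which is the
    directory that would typically be used as HIVE_HOME.
    """
    with tarfile.open(tarball_path, "r:gz") as archive:
        # Assumption: all members live under one release directory, so
        # the first path component of the first member names it.
        top_level = archive.getnames()[0].split("/")[0]
        archive.extractall(install_dir)
    return os.path.join(install_dir, top_level)


# Example usage (file and directory names are assumptions):
#   hive_home = unpack_hive("apache-hive-1.2.1-bin.tar.gz", "/usr/local")
#   print("Set HIVE_HOME to", hive_home)
```

Only unpack archives obtained from a trusted source, since extraction writes whatever paths the archive contains.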
How it works…
By default, Hive is configured to use an embedded Derby database whose disk storage location is determined by the Hive configuration variable named javax.jdo.option.ConnectionURL. By default, this location is set to the metastore_db directory under the current working directory, as configured in the conf/hive-default.xml file. With Derby as the metastore in embedded mode, Hive allows at most one user at a time.
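To make the relationship between the connection URL and the on-disk location concrete, the following sketch resolves where an embedded Derby database would be stored. It assumes the stock default value jdbc:derby:;databaseName=metastore_db;create=true and Derby's default behavior of resolving a relative database name against the directory the process was started from; the function name derby_database_path is hypothetical.

```python
import os


def derby_database_path(connection_url, working_dir):
    """Return the on-disk location of an embedded Derby database.

    Reads the database name from a `jdbc:derby:` URL, either directly
    after the prefix or from a `databaseName` attribute. A relative name
    is resolved against working_dir; an absolute name is kept as-is.
    """
    prefix = "jdbc:derby:"
    if not connection_url.startswith(prefix):
        raise ValueError("not an embedded Derby JDBC URL")
    # Everything after the prefix is "<name>;attr=value;attr=value...".
    name, *attributes = connection_url[len(prefix):].split(";")
    for attribute in attributes:
        key, _, value = attribute.partition("=")
        if key == "databaseName":
            name = value
    if not name:
        raise ValueError("URL does not name a database")
    return os.path.normpath(os.path.join(working_dir, name))


# Example usage (the working directory is an assumption):
#   url = "jdbc:derby:;databaseName=metastore_db;create=true"
#   print(derby_database_path(url, os.getcwd()))
```

Because the default name is relative, starting Hive from a different directory creates a separate, empty metastore there; setting an absolute databaseName avoids that surprise.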
The other modes of installation are Hive with local metastore and Hive with remote metastore, which will be discussed later.