Hive packages
The following are the various sections included in Hive packages.
Getting ready
The Hive source is organized into modules, each grouped by the feature it provides or nested as a submodule of another module.
How to do it...
The following is the list of Hive modules and their usage in Hive:
accumulo-handler: Apache Accumulo is a distributed key-value datastore based on Google's Bigtable. This package includes the components responsible for mapping a Hive table to an Accumulo table. AccumuloStorageHandler and AccumuloPredicateHandler are the main classes responsible for mapping tables. For more information, refer to the official integration documentation available at https://cwiki.apache.org/confluence/display/Hive/AccumuloIntegration.
ant: This tool is used to build earlier versions of the Hive source. Ant is also needed to configure the Hive Web Interface server.
beeline: A Hive client used to connect to HiveServer2 and run Hive queries.
bin: This package includes scripts to start Hive clients and services.
cli: This is the Hive command-line interface implementation.
common: These are utility classes used by other modules.
conf: This contains default configurations and uses defined configuration objects.
contrib: This contains SerDes, generic UDFs, and file formats contributed by third parties to Hive.
hbase-handler: This module allows Hive SQL statements to access HBase tables for SELECT and INSERT commands. It also provides interfaces to access HBase and Hive tables for join and union in a single query. More information is available at https://cwiki.apache.org/confluence/display/Hive/HBaseIntegration.
hcatalog: This is a table management framework that helps other frameworks such as Pig or MapReduce to access the Hive metastore and table schema.
hwi: This module provides an implementation of a web interface to run Hive queries. Also, the WebHCat APIs provide REST APIs to access the Hive metastore.
jdbc: This is a connector that accepts JDBC connections and calls to execute Hive queries on the cluster.
metastore: This is the API that provides access to metastore entities, including databases, tables, schemas, and SerDes.
odbc: This module implements the Open Database Connectivity (ODBC) API, enabling ODBC applications to connect to Hive and execute queries.
ql: This module provides an interface to clients that checks query semantics and provides implementations of the driver, parser, and query planner.
serde: This module implements the serializers and deserializers used by Hive to read and write data. It helps in validating and parsing record and field types.
shims: This is the module that transparently intercepts and modifies calls to the Hive API, usually for compatibility purposes.
spark-client: This module provides an interface to execute Hive SQL on the Spark framework.
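
To make the role of the accumulo-handler and hbase-handler modules more concrete, here is a minimal sketch that assembles the kind of CREATE TABLE statement that binds a Hive table to an external store through a storage handler. It only builds the statement text and does not connect to Hive. The handler class names and the columns-mapping property keys follow the integration pages linked in the list above, as far as I know. The helper storage_handler_ddl, the table names, and the column family names are hypothetical and exist only for illustration.

```python
# Storage handler classes provided by the two handler modules, together with
# the SerDe property each one reads its column mapping from.
STORAGE_HANDLERS = {
    "hbase": {
        "class": "org.apache.hadoop.hive.hbase.HBaseStorageHandler",
        "mapping_key": "hbase.columns.mapping",
    },
    "accumulo": {
        "class": "org.apache.hadoop.hive.accumulo.AccumuloStorageHandler",
        "mapping_key": "accumulo.columns.mapping",
    },
}


def storage_handler_ddl(backend, table, columns, mapping):
    """Build a CREATE TABLE statement for a Hive table backed by `backend`.

    columns: list of (name, hive_type) pairs for the Hive side.
    mapping: list of backend column specs, one per Hive column, in order.
    This helper is hypothetical; it is not part of Hive.
    """
    if len(columns) != len(mapping):
        raise ValueError("each Hive column needs exactly one mapping entry")
    handler = STORAGE_HANDLERS[backend]
    column_list = ", ".join(f"{name} {hive_type}" for name, hive_type in columns)
    mapping_value = ",".join(mapping)
    return (
        f"CREATE TABLE {table} ({column_list})\n"
        f"STORED BY '{handler['class']}'\n"
        f"WITH SERDEPROPERTIES ('{handler['mapping_key']}' = '{mapping_value}')"
    )


# Hypothetical tables: the first mapping entry is the row key, the second
# maps the Hive column to a column family and qualifier in the backing store.
hbase_ddl = storage_handler_ddl(
    "hbase", "hbase_users",
    [("id", "int"), ("name", "string")],
    [":key", "cf:name"],
)
accumulo_ddl = storage_handler_ddl(
    "accumulo", "accumulo_users",
    [("id", "int"), ("name", "string")],
    [":rowid", "person:name"],
)

print(hbase_ddl)
print(accumulo_ddl)
```

The resulting statements could be run from beeline or the Hive CLI on a cluster where the corresponding handler JARs and the backing store are available. Which mapping options are supported depends on the Hive version, so the integration pages remain the authoritative reference.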