Working with Accumulo
In this recipe, you will learn how to integrate Hive with Apache Accumulo.
Apache Accumulo is a sparse, distributed, sorted, multidimensional map of key-value pairs, modeled after Google's Bigtable design. As a key-value store, it handles structured, semi-structured, and unstructured data, and it provides fast reads and writes on tables that contain large volumes of data.
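To make the "sparse, sorted, multidimensional map" idea more concrete, here is a minimal in-memory sketch in Python. It is only an illustration of the data model, not the Accumulo API: the names Key, SortedKeyValueMap, put, and scan are hypothetical, and the sketch leaves out the visibility and timestamp parts of a real Accumulo key as well as any distribution across servers.

```python
from bisect import bisect_left
from typing import Iterator, List, NamedTuple, Tuple


class Key(NamedTuple):
    """Simplified key: row ID plus column family and qualifier (hypothetical)."""
    row: str
    family: str
    qualifier: str


class SortedKeyValueMap:
    """Toy stand-in for a sorted key-value table; not the Accumulo API."""

    def __init__(self) -> None:
        # Parallel lists, always kept in ascending key order.
        self._keys: List[Key] = []
        self._values: List[str] = []

    def put(self, row: str, family: str, qualifier: str, value: str) -> None:
        key = Key(row, family, qualifier)
        i = bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            self._values[i] = value  # overwrite the existing entry
        else:
            self._keys.insert(i, key)  # insert at the sorted position
            self._values.insert(i, value)

    def scan(self, start_row: str, end_row: str) -> Iterator[Tuple[Key, str]]:
        """Yield entries whose row lies in [start_row, end_row), in key order."""
        i = bisect_left(self._keys, Key(start_row, "", ""))
        while i < len(self._keys) and self._keys[i].row < end_row:
            yield self._keys[i], self._values[i]
            i += 1


table = SortedKeyValueMap()
# Sparse: each row stores only the columns it actually has.
table.put("user2", "info", "name", "Bob")
table.put("user1", "info", "name", "Alice")
table.put("user1", "info", "email", "alice@example.com")
table.put("user3", "stats", "logins", "7")

for key, value in table.scan("user1", "user3"):
    print(key.row, key.family, key.qualifier, value)
```

Because entries are kept sorted by key, a range scan over adjacent row IDs only has to locate the start position and read forward, which is one reason this kind of store can serve large tables quickly.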
Getting ready
This recipe covers using Hive together with Accumulo. You must have Apache Accumulo installed on your system before proceeding.
The Accumulo integration with Hive has two main components:
AccumuloStorageHandler: This class maps a Hive table to an Accumulo table. It also configures the Hive queries.
AccumuloPredicateHandler: This class handles filter operations. It pushes filters down to Accumulo to reduce the amount of data returned. The following four properties...