Architecture
First, let's take a look at some of the terminology that is specific to HBase:
- Table: A table in HBase roughly means the same thing as a table in an RDBMS. Data in an HBase cluster is organized into tables, which are distributed across a set of nodes. HBase tables are divided into regions, which are assigned to RegionServers.
- Namespace: A namespace is a logical grouping of tables. Typically, each application gets its own namespace on the cluster and keeps all of its tables within that namespace.
- Region: An HBase table is broken up into individual shards, or partitions, called regions, which are distributed across nodes in the cluster. Each region corresponds to a unique slice of the key space. Each key, together with the value stored against it, maps to exactly one region, determined by the key range that the key falls within. A region is the basic unit of assignment of data within the cluster.
- RegionServer: The HBase RegionServer is the JVM instance that hosts a given region...
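
To make the relationship between keys, regions, and RegionServers concrete, here is a minimal sketch in plain Java. It does not use the HBase client API. The class name, region names, and server names are all hypothetical, and row keys are modeled as ASCII strings rather than the byte arrays that HBase actually compares. Each region is identified by its start key, and a row key belongs to the region with the greatest start key that is less than or equal to it.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Illustrative model only: not HBase's implementation or API.
public class RegionLookupSketch {
    // Region start key -> region name, kept sorted by start key.
    private final TreeMap<String, String> regionsByStartKey = new TreeMap<>();
    // Region name -> the RegionServer currently hosting it.
    private final Map<String, String> serverByRegion = new HashMap<>();

    // Registers a region covering keys from startKey (inclusive) up to the
    // next region's start key (exclusive), and assigns it to a server.
    void addRegion(String startKey, String regionName, String server) {
        regionsByStartKey.put(startKey, regionName);
        serverByRegion.put(regionName, server);
    }

    // Finds the region whose key range contains rowKey.
    String regionFor(String rowKey) {
        return regionsByStartKey.floorEntry(rowKey).getValue();
    }

    // Finds the server hosting the region that contains rowKey.
    String serverFor(String rowKey) {
        return serverByRegion.get(regionFor(rowKey));
    }

    // Builds a hypothetical table split into three regions on two servers.
    static RegionLookupSketch example() {
        RegionLookupSketch table = new RegionLookupSketch();
        // The empty start key marks the first region, so every key has a home.
        table.addRegion("", "region-1", "regionserver-a");
        table.addRegion("g", "region-2", "regionserver-b");
        table.addRegion("p", "region-3", "regionserver-a");
        return table;
    }

    public static void main(String[] args) {
        RegionLookupSketch table = example();
        for (String key : new String[] {"apple", "grape", "zebra"}) {
            System.out.println(key + " -> " + table.regionFor(key)
                    + " on " + table.serverFor(key));
        }
    }
}
```

In this sketch a single server hosts more than one region, and a region moves between servers by changing only its entry in the assignment map. This mirrors the idea that the region is the unit of assignment.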