The origin of HBase
Looking at the limitations of GFS and MR, Google approached another solution, which not only uses GFS for data storage but it is also used for processing the smaller data files very efficiently. They called this new solution BigTable.
Note
BigTable is a distributed storage system for managing structured data that is designed to scale to a very large size: petabytes of data across thousands of commodity servers.
Welcome to the world of HBase, http://hbase.apache.org/. HBase is a NoSQL database that primarily works on top of Hadoop. HBase is based on the storage architecture followed by the BigTable. HBase inherits the storage design from the column-oriented databases and the data access design from the keyvalue store databases where a key-based access to a specific cell of data is provided.
Note
In column-oriented databases, data grouped by columns and column values is stored contiguously on a disk. Such a design is highly I/O effective when dealing with very large data sets used for analytical queries where not all the columns are needed.
HBase can be defined as a sparse, distributed, persistent, multidimensional sorted map, which is indexed by a row key, column key, and timestamp. HBase is designed to run on a cluster of commodity hardware and stores both structured and semi-structured data. HBase has the ability to scale horizontally as you add more machines to the cluster.