Chapter 1. Understanding the HBase Ecosystem
HBase is a horizontally scalable, distributed, open source, sorted map database. It runs on top of the Hadoop Distributed File System (HDFS). HBase is a nonrelational NoSQL database that doesn't always require a predefined schema. It can be seen as a scalable, flexible, multidimensional spreadsheet into which data of any structure fits: new column fields can be added on the fly, and no column structure has to be defined before data can be inserted or queried. In other words, HBase is a column-based database that runs on top of HDFS and supports features such as linear scalability (scale out), automatic failover, automatic sharding, and a more flexible schema.
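To make the "sorted map" and "on-the-fly column" ideas concrete, here is a minimal in-memory sketch in Java. It is only a toy model, not the HBase client API: the class `SortedMapSketch`, its methods, and the sample row keys are all hypothetical names chosen for illustration, and it leaves out most details of the real data model.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Toy model of a sorted, multidimensional map (hypothetical; not the HBase API).
public class SortedMapSketch {
    // row key -> (column name -> value); TreeMap keeps both levels sorted by key.
    private final NavigableMap<String, NavigableMap<String, String>> rows = new TreeMap<>();

    // Store a value. The column does not have to be declared beforehand.
    public void put(String rowKey, String column, String value) {
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>()).put(column, value);
    }

    // Return the stored value, or null if the row or column is absent.
    public String get(String rowKey, String column) {
        NavigableMap<String, String> columns = rows.get(rowKey);
        return columns == null ? null : columns.get(column);
    }

    // Return the row keys in [startInclusive, stopExclusive), in sorted order.
    public List<String> scanRowKeys(String startInclusive, String stopExclusive) {
        return new ArrayList<>(rows.subMap(startInclusive, true, stopExclusive, false).keySet());
    }

    public static void main(String[] args) {
        SortedMapSketch table = new SortedMapSketch();
        table.put("user-002", "name", "Bob");
        table.put("user-001", "name", "Alice");
        // A new column appears on the fly, only for the row that needs it.
        table.put("user-001", "email", "alice@example.com");

        // Rows come back in key order, regardless of insertion order.
        for (String rowKey : table.scanRowKeys("user-001", "user-003")) {
            System.out.println(rowKey);
        }
    }
}
```

Because the rows are kept sorted by key, a range scan is just a view over a contiguous slice of the map. Each row holds only the columns that were actually written to it, which is what makes the spreadsheet analogy "flexible".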
HBase is modeled on Google BigTable, a compressed, high-performance, proprietary data store built on the Google file system. HBase was developed as a Hadoop subproject to support the storage of structured data, and it can take advantage of most distributed file systems (typically, the Hadoop Distributed File System, known as HDFS).
The following table contains key information about HBase and its features:
Features | Description |
---|---|
Developed by | Apache |
Written in | Java |
Type | Column oriented |
License | Apache License |
Lacking features of relational databases | SQL support, relations, primary, foreign, and unique key constraints, normalization |
Website | |
Distributions | Apache, Cloudera |
Download link | |
Mailing lists | |
Blog | |