Although Hadoop was getting popular after its invention, it was still only suitable for batch processing use cases where a huge set of data could be processed in a single batch. Hadoop came from the Google research paper called Hadoop Distributed File System (HDFS) from the Google File System Research paper and MapReduce from the Google MapReduce research paper. Google has one more popular product, which is Big Table, and to support random read/write access over large sets of data, HBase was discovered. HBase runs on top of Hadoop and uses the scalability of Hadoop by running its daemon—HDFS, with real-time data access as a key/value store.Â
Apache HBase is an open source, distributed, NoSQL database that provides real-time random read/write access to large datasets over HDFS.