Other Apache projects
Whether you use a bundled distribution or stick with the base Apache Hadoop download, you will encounter many references to other, related Apache projects. We have covered Hive, Sqoop, and Flume in this book; we'll now highlight some of the others.
Note that this coverage seeks to point out the highlights (from my perspective) and to give a taste of the wide range of projects available. As before, keep looking out; new ones launch all the time.
HBase
Perhaps the most popular Apache Hadoop-related project that we didn't cover in this book is HBase; its homepage is at http://hbase.apache.org. Based on the BigTable model of data storage publicized by Google in an academic paper (sound familiar?), HBase is a non-relational data store sitting atop HDFS.
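To give a feel for what a BigTable-style data store looks like, here is a minimal in-memory sketch in Python. It is not the HBase API and stores nothing in HDFS. The class name SketchTable and its put, get, and scan methods are hypothetical names chosen for illustration. It assumes only the general shape of the model: rows kept sorted by key, values addressed by a row key and a column name, and multiple timestamped versions per cell.

```python
import bisect


class SketchTable:
    """Toy model of a BigTable-style table (hypothetical; not the HBase API)."""

    def __init__(self):
        self._rows = {}         # row key -> {column -> [(timestamp, value), ...]}
        self._sorted_keys = []  # row keys kept in sorted order for range scans

    def put(self, row, column, value, timestamp):
        """Store a new version of a cell."""
        if row not in self._rows:
            bisect.insort(self._sorted_keys, row)
            self._rows[row] = {}
        versions = self._rows[row].setdefault(column, [])
        versions.append((timestamp, value))
        # Keep the newest version first so reads can take element 0.
        versions.sort(key=lambda tv: tv[0], reverse=True)

    def get(self, row, column):
        """Return the newest value of a cell, or None if it is absent."""
        versions = self._rows.get(row, {}).get(column)
        return versions[0][1] if versions else None

    def scan(self, start_row, stop_row):
        """Yield (row key, newest cell values) for start_row <= key < stop_row."""
        lo = bisect.bisect_left(self._sorted_keys, start_row)
        hi = bisect.bisect_left(self._sorted_keys, stop_row)
        for key in self._sorted_keys[lo:hi]:
            cells = self._rows[key]
            yield key, {col: vers[0][1] for col, vers in cells.items()}


# Example usage with made-up row keys and "family:qualifier" style column names.
table = SketchTable()
table.put("user#001", "info:name", "Alice", timestamp=1)
table.put("user#001", "info:name", "Alicia", timestamp=2)
table.put("user#002", "info:name", "Bob", timestamp=1)
latest_name = table.get("user#001", "info:name")
first_rows = list(table.scan("user#001", "user#002"))
```

Looking up a single row by key is a direct access rather than a pass over the whole dataset, and because row keys are sorted, a range of adjacent keys can be read together. The real system layers persistence and distribution on top of this idea.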
Whereas both MapReduce and Hive tasks focus on batch-like data access patterns, HBase instead seeks to provide very low-latency access to data. Consequently, HBase can, unlike the already mentioned...