In this chapter, we studied the components of the Hadoop ecosystem and the tools they provide for solving complex industrial problems. We briefly surveyed the tools and software that run on Hadoop, specifically Apache Kafka, Apache Pig, Apache Sqoop, and Apache Flume. We also covered SQL-based and NoSQL-based data stores on Hadoop: Hive and HBase, respectively.
In the next chapter, we will look at some analytics components, along with more advanced topics in Hadoop.