Chapter 3. An Introduction to Hadoop's Architecture and Ecosystem
From this chapter onward, we turn to the implementation aspects of machine learning. We begin with the platform of choice, Hadoop: a platform that can scale to advanced enterprise data needs, specifically the big data needs of machine learning.
In this chapter, we cover the Hadoop platform and its capabilities in addressing large-scale loading, storage, and processing challenges for machine learning. In addition to an overview of the Hadoop architecture, its core frameworks, and the other supporting ecosystem components, the chapter includes a detailed installation process with an example deployment approach. Although there are many commercial distributions of Hadoop, our focus in this chapter is the open source Apache distribution of Hadoop (latest version 2.x).
In this chapter, the following topics are covered in depth:
- An introduction to Apache Hadoop, its evolution history, the core...