Chapter 1. Introduction to MapReduce
In this first chapter, we look at the core technologies behind Hadoop's distributed processing model. Specifically, we cover the following:
- The Hadoop platform and the framework it provides
- The MapReduce programming model
- Technologies built on top of MapReduce that provide a higher-level abstraction and an API that is easier to understand and work with
In the following diagram, Hadoop stands at the base, and MapReduce, as a design pattern, enables the execution of distributed jobs on top of it. Because MapReduce is a low-level programming model, a number of libraries such as Cascading, Pig, and Hive provide alternative, higher-level APIs whose programs are compiled into MapReduce jobs. Cascading, a Java application framework, has a number of extensions in functional programming languages; Scalding is the one presented in this book.
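To give a feel for why MapReduce is described as a low-level model, the following is a minimal sketch of the classic word count written in plain Java, with no Hadoop dependency. It only imitates the three phases (map, shuffle, reduce) in a single process; the class name WordCountSketch and its method names are hypothetical and chosen for illustration, and a real Hadoop job would distribute these phases across a cluster.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, single-process imitation of the MapReduce phases.
// This is NOT the Hadoop API; it only illustrates the programming model.
public class WordCountSketch {

    // Map phase: turn one input line into a list of (word, 1) pairs.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(new SimpleEntry<>(word, 1));
            }
        }
        return pairs;
    }

    // Shuffle phase: group every emitted value under its key.
    // A TreeMap keeps the keys sorted, which makes the output predictable.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                   .add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: collapse all values for one key into a single count.
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    // Drive the three phases in order over a small in-memory input.
    static Map<String, Integer> wordCount(List<String> lines) {
        List<Map.Entry<String, Integer>> mapped = new ArrayList<>();
        for (String line : lines) {
            mapped.addAll(map(line));
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : shuffle(mapped).entrySet()) {
            result.put(entry.getKey(), reduce(entry.getKey(), entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
                "hadoop runs mapreduce",
                "scalding compiles to mapreduce");
        // prints {compiles=1, hadoop=1, mapreduce=2, runs=1, scalding=1, to=1}
        System.out.println(wordCount(lines));
    }
}
```

Even for a task this small, the programmer has to spell out the key-value pairs, the grouping, and the aggregation by hand. That verbosity is what libraries such as Cascading, Pig, and Hive aim to hide behind their own APIs.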