MapReduce execution passes through several steps, and each step offers some scope for optimization. In the previous sections, we covered the components of the MapReduce framework; now we will briefly look at the MapReduce execution flow, which will help us understand how those components interact with one another. The following diagram gives a brief overview of the MapReduce execution flow. We have divided the diagram into smaller parts so that each step is easier to understand. The step numbers appear above the arrow connectors, and the last arrow in the diagram connects to the following diagram in the section:
The steps of the MapReduce internal flow are as follows:
- The InputFormat is the starting point of any MapReduce application. It is defined in the job configuration in the...