MapReduce management
As we saw in the previous chapter, the MapReduce framework is generally more tolerant of problems and failures than HDFS. The JobTracker and TaskTrackers have no persistent data to manage and, consequently, the management of MapReduce is more about the handling of running jobs and tasks than servicing the framework itself.
Command line job management
The hadoop job command-line tool is the primary interface for this job management. As usual, type the following to get a usage summary:
$ hadoop job --help
The options to the command are generally self-explanatory; it allows you to start, stop, list, and modify running jobs in addition to retrieving some elements of job history. Instead of examining each individually, we will explore the use of several of these subcommands together in the next section.
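As a quick orientation before that walkthrough, here is a minimal sketch of how a few of the most common subcommands fit together. The hjob helper and the DRY_RUN variable are hypothetical names introduced only for illustration, and the job ID in the usage note is a placeholder; the -list, -status, -set-priority, and -kill options are the ones reported in the usage summary of Hadoop 1.x, so check the output of the help command on your own version before relying on them.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hadoop): maps a short action name to the
# corresponding `hadoop job` subcommand.
# With DRY_RUN=1 it only prints the command, so it can be tried without a cluster.

hjob() {
  action="$1"
  [ "$#" -gt 0 ] && shift

  # Build the full command line in the function's positional parameters.
  case "$action" in
    list)     set -- hadoop job -list ;;                      # show running jobs
    status)   set -- hadoop job -status "$1" ;;               # progress and counters for one job
    priority) set -- hadoop job -set-priority "$1" "$2" ;;    # e.g. VERY_HIGH, HIGH, NORMAL, LOW, VERY_LOW
    kill)     set -- hadoop job -kill "$1" ;;                 # stop a running job
    *)        echo "unknown action: $action" >&2; return 1 ;;
  esac

  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"      # print the command instead of running it
  else
    "$@"           # run the command against the cluster
  fi
}
```

For example, with DRY_RUN=1 set, hjob priority job_201301010000_0001 HIGH prints the underlying hadoop job -set-priority command for that placeholder job ID without contacting the JobTracker. Unset DRY_RUN to run the commands for real.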
Have a go hero – command line job management
The MapReduce UI also provides access to a subset of these capabilities. Explore the UI and see what you can and cannot do from the...