Coordinating job execution
We often need to coordinate the execution of Scalding jobs, and even mix them with other applications. For example, let us assume that we have implemented two Scalding jobs and one Scala application, as shown:
class JobA(args: Args) extends Job(args) { /* pipeline */ }

class JobB(args: Args) extends Job(args) { /* pipeline */ }

object ScalaApp extends App { /* application logic */ }
To ensure that we execute the preceding tasks in a predefined order, we can implement a runner class. A runner class implemented in Scala should extend App in order to work in the imperative programming style. This means that statements are executed synchronously and sequentially.
We can use this to our advantage and coordinate the execution of MapReduce tasks, other applications, and even external system commands, such as shown in the following example code:
object ExampleRunner extends App {
  val runnerArgs = Args(args)
  val configuration = new org.apache.hadoop.conf.Configuration...
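A fuller version of such a runner might look like the following sketch. It assumes that Scalding and Hadoop are on the classpath; it launches each job through Hadoop's ToolRunner with Scalding's com.twitter.scalding.Tool entry point (the standard way to submit a Scalding job), then invokes the plain Scala application, and finally shells out to an external command. The --hdfs flag and the hdfs dfs -ls / command are illustrative placeholders, not requirements.

```scala
import com.twitter.scalding.{Args, Tool}
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.util.ToolRunner
import scala.sys.process._

object ExampleRunner extends App {
  val runnerArgs = Args(args)
  val configuration = new Configuration

  // Run JobA first; ToolRunner.run blocks until the job completes,
  // so the next statement does not start before JobA finishes.
  ToolRunner.run(configuration, new Tool,
    Array(classOf[JobA].getName, "--hdfs") ++ args)

  // JobB executes only after JobA has completed.
  ToolRunner.run(configuration, new Tool,
    Array(classOf[JobB].getName, "--hdfs") ++ args)

  // Run the plain Scala application logic next.
  ScalaApp.main(args)

  // Finally, invoke an external system command (placeholder command).
  "hdfs dfs -ls /".!
}
```

Because App bodies run top to bottom, each step acts as a synchronization point: a failure in JobA (ToolRunner.run throwing or returning a non-zero code, which you may want to check) can be used to abort the rest of the sequence.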