Organizations that develop and operate big data applications very likely already have a Hadoop cluster deployed. Many of them also run real-time stream processing applications alongside the batch applications running on Hadoop.
It would be great if we could leverage the already deployed YARN cluster to run Storm topologies as well. This reduces operational and maintenance costs, because there is only one cluster to manage instead of two.
Storm-YARN is a project developed by Yahoo! that enables the deployment of Storm topologies over YARN clusters by running the Storm processes on nodes managed by YARN.
The following diagram illustrates how the Storm processes are deployed on YARN:
In the next section, we will see how to set...