A high-level view of various components of Storm
In this section, we will get you acquainted with the various components of Storm, their roles, and their distribution in a Storm cluster.
A Storm cluster has three sets of nodes (which can be co-located on the same machines, but are generally distributed across separate ones), which are as follows:
- Nimbus
- Zookeeper
- Supervisor
The following figure shows the integration hierarchy of these nodes:
The detailed explanation of the integration hierarchy is as follows:
- Nimbus node (master node, similar to the Hadoop JobTracker): This is the heart of the Storm cluster. It is the master daemon process and is responsible for the following:
- Uploading and distributing various tasks across the cluster
- Uploading and distributing the topology JARs across the various supervisors
- Launching workers as per ports allocated on the supervisor nodes
- Monitoring the topology execution and reallocating workers whenever necessary
- Hosting the Storm UI, which is also executed on the same node
- Zookeeper nodes: Zookeeper nodes act as the bookkeepers of the Storm cluster. Once a topology has been submitted and distributed by Nimbus, it continues to execute even if Nimbus dies, because the working state is maintained and logged by Zookeeper for as long as the Zookeeper nodes are alive. The main responsibility of this component is to maintain the operational state of the cluster and to restore that state when recovery from a failure is required. It is the coordinator of the Storm cluster.
- Supervisor nodes: These are the main processing chambers in the Storm topology; all the action happens here. They are daemon processes that listen for and manage the work assigned to them. They communicate with Nimbus through Zookeeper, and they start and stop workers according to signals from Nimbus.
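To make the relationship between the three node types more concrete, here is a minimal sketch of a storm.yaml file of the kind each node in the cluster reads. The hostnames (zk1.example.com, nimbus1.example.com, and so on) and the local directory are placeholders assumed for illustration, not values from this book. The exact key for locating Nimbus depends on the Storm release: `nimbus.seeds` is used in Storm 1.x and later, while older 0.9.x releases use `nimbus.host`.

```yaml
# storm.yaml (illustrative sketch; all hostnames and paths are assumed placeholders)

# Zookeeper ensemble that holds the cluster's operational state.
# Nimbus and the supervisors coordinate through these servers.
storm.zookeeper.servers:
  - "zk1.example.com"
  - "zk2.example.com"
  - "zk3.example.com"
storm.zookeeper.port: 2181

# Where the Nimbus (master) daemon runs.
# On Storm 0.9.x, use instead:  nimbus.host: "nimbus1.example.com"
nimbus.seeds: ["nimbus1.example.com"]

# Local working directory used by the Nimbus and supervisor daemons.
storm.local.dir: "/var/storm"

# Ports on each supervisor node; each port is one slot in which
# a worker process can be launched.
supervisor.slots.ports:
  - 6700
  - 6701
  - 6702
  - 6703

# Port for the Storm UI, commonly run on the Nimbus node.
ui.port: 8080
```

With four entries under `supervisor.slots.ports`, each supervisor node can host at most four workers, which is the sense in which workers are launched as per the ports allocated on the supervisor nodes.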