A physical view of a Storm cluster
The next figure shows where each process physically runs. A cluster has only one Nimbus, more than one Zookeeper node to support failover, and one supervisor per machine.
Stream grouping
A stream grouping controls how tuples flow from a spout to a bolt, or from one bolt to another. This section covers four types of grouping, of which shuffle and field grouping are the most commonly used:
- Shuffle grouping: Each tuple is sent to a randomly chosen task of the downstream bolt, so tuples are spread roughly evenly across its tasks
- Field grouping: Tuples with the same value for the grouping field are always delivered to the same task of the downstream bolt
- All grouping: Sends the same tuple to all tasks of the downstream bolt
- Global grouping: Tuples from all upstream tasks are sent to a single task of the downstream bolt
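The routing rule behind each of the four groupings above can be sketched in a few lines of Python. This is only an illustrative simulation, not Storm's API or its actual implementation: the function names, the use of task indices `0..num_tasks-1`, the CRC32 hash, and the choice of task `0` for global grouping are all assumptions made for the example.

```python
import random
import zlib


def shuffle_grouping(num_tasks, rng=random):
    """Shuffle grouping (simplified): pick any downstream task at random."""
    return rng.randrange(num_tasks)


def field_grouping(key, num_tasks):
    """Field grouping: hash the field value so equal keys map to the same task."""
    # CRC32 is an assumption of this sketch; it is used because it is
    # stable across runs, unlike Python's built-in hash() for strings.
    return zlib.crc32(str(key).encode("utf-8")) % num_tasks


def all_grouping(num_tasks):
    """All grouping: every downstream task receives a copy of the tuple."""
    return list(range(num_tasks))


def global_grouping(num_tasks):
    """Global grouping: every tuple goes to one task (index 0 in this sketch)."""
    return 0


# Hypothetical stream of words routed to a bolt with four tasks.
words = ["storm", "bolt", "storm", "spout", "storm"]
num_tasks = 4
field_targets = [field_grouping(w, num_tasks) for w in words]
shuffle_targets = [shuffle_grouping(num_tasks) for _ in words]
```

In this sketch, every occurrence of `"storm"` lands on the same index in `field_targets`, while the entries of `shuffle_targets` vary from run to run. That difference is why field grouping suits per-key state such as counts, and shuffle grouping suits stateless work that only needs an even load.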
The following figure illustrates the four groupings:
Fault tolerance in Storm
The supervisor runs a synchronization thread that fetches assignment information (which part of the topology it is supposed to run) from Zookeeper and writes it to the local...