Setting up workers and parallelism to enhance processing
Storm is a highly scalable, distributed, and fault-tolerant real-time parallel processing compute framework. Note the emphasis on scalability, distribution, and parallel processing: we already know that Storm operates in clustered mode and is therefore distributed by nature, and scalability was covered in the previous section. Now let's take a closer look at parallelism. We introduced this concept in an earlier chapter; here we'll get you acquainted with how to tune it to achieve the desired performance. The following points are the key criteria for this:
A topology is allocated a certain number of workers at the time it is submitted.
Each component in the topology (bolts and spouts) has a specified number of executors associated with it. This executor count determines the degree of parallelism for each running component of the topology.
The overall efficiency and speed of a Storm topology are driven by its parallelism...
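As a sketch of the points above, the worker count is set on the topology's Config via setNumWorkers, and the executor count for each component is passed as the parallelism hint to setSpout and setBolt. The spout and bolt class names (SentenceSpout, SplitBolt) are hypothetical placeholders, not classes from the original text; the Config, TopologyBuilder, and StormSubmitter calls are the standard Storm API.

```java
import org.apache.storm.Config;
import org.apache.storm.StormSubmitter;
import org.apache.storm.topology.TopologyBuilder;

public class ParallelismExample {
    public static void main(String[] args) throws Exception {
        TopologyBuilder builder = new TopologyBuilder();

        // Parallelism hint = number of executors for this component.
        // Two executors will run the spout's instances.
        builder.setSpout("sentence-spout", new SentenceSpout(), 2);

        // Four executors for the bolt; setNumTasks can additionally
        // spread more task instances across those executors.
        builder.setBolt("split-bolt", new SplitBolt(), 4)
               .setNumTasks(8)
               .shuffleGrouping("sentence-spout");

        Config conf = new Config();
        // Number of worker processes allocated to this topology
        // across the cluster at submission time.
        conf.setNumWorkers(3);

        StormSubmitter.submitTopology("parallel-topology", conf,
                builder.createTopology());
    }
}
```

With this configuration, the 2 spout executors and 4 bolt executors are distributed across the 3 worker processes; increasing either number raises the degree of parallelism for that part of the topology without changing its logic.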