Trident repartitioning operations
Repartitioning operations let you partition tuples across multiple tasks. A repartitioning operation doesn't change the content of the tuples. Repartitioning is also the only time tuples pass over the network. Here are the different types of repartitioning operation.
Utilizing the shuffle operation
This repartitioning operation distributes the tuples uniformly at random across multiple tasks. It is generally used when we want to spread the processing load evenly across the tasks. The following diagram shows how the input tuples are repartitioned using the shuffle operation:
Here is a piece of code that shows how we can use the shuffle operation:
mystream.shuffle().each(new Fields("a","b"), new myFilter()).parallelismHint(2);
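To make the idea of a uniform, random spread concrete, here is a minimal plain-Java sketch that does not use Storm at all. It simulates one plausible way to shuffle tuples across tasks: each round visits every task exactly once, in a freshly randomized order. The class name ShuffleSketch and the methods assignTasks and countPerTask are hypothetical names chosen for this illustration, and the randomized round-robin strategy is an assumption, not a copy of Trident's internal code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical illustration only: simulates a uniform, random repartitioning
// of tuples across tasks. This is not Storm/Trident code.
public class ShuffleSketch {

    // Returns, for each tuple index, the task it is sent to.
    // Every block of taskCount consecutive tuples is spread over all tasks
    // in a freshly shuffled order, so per-task counts differ by at most one.
    static int[] assignTasks(int tupleCount, int taskCount, Random random) {
        if (taskCount <= 0) {
            throw new IllegalArgumentException("taskCount must be positive");
        }
        List<Integer> order = new ArrayList<>();
        for (int task = 0; task < taskCount; task++) {
            order.add(task);
        }
        int[] assignment = new int[tupleCount];
        for (int i = 0; i < tupleCount; i++) {
            int position = i % taskCount;
            if (position == 0) {
                // Start of a new round: pick a new random task order.
                Collections.shuffle(order, random);
            }
            assignment[i] = order.get(position);
        }
        return assignment;
    }

    // Counts how many tuples each task received.
    static int[] countPerTask(int[] assignment, int taskCount) {
        int[] counts = new int[taskCount];
        for (int task : assignment) {
            counts[task]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // Two tasks, mirroring parallelismHint(2) in the snippet above.
        int[] assignment = assignTasks(10, 2, new Random());
        int[] counts = countPerTask(assignment, 2);
        System.out.println("task 0: " + counts[0] + ", task 1: " + counts[1]);
    }
}
```

With 10 tuples and 2 tasks, each task receives 5 tuples, although which tuple goes to which task changes from run to run. That is the property that makes shuffle a reasonable choice when the goal is simply to balance load.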
Utilizing the partitionBy operation
This repartitioning operation enables you to partition the stream on the basis of the fields in the tuples. For example, if...