Partitioning
As discussed in Chapter 4, Scalability, Low Latency, and Performance, stateless partitioning of a single operator can be accomplished by setting the PARTITIONER
attribute. For the current example, we could partition CSVParser
using the following configuration stanza:
<property> <name>apex.operator.CSVParser.attr.PARTITIONER</name> <value>com.datatorrent.common.partitioner.StatelessPartitioner:2</value></property>
If we want some section of the SQL pipeline to be partitioned in parallel, we can set the PARTITION_PARALLEL
attribute on the input ports of the downstream operators in that section, as shown in this example:
<property> <name>apex.operator.LogicalFilter_1.inputport.input.attr.PARTITION_PARALLEL </name> <value>true</value></property>
With these changes, the physical DAG of our application would look like this:
The application's physical DAG