Task 22 – A non-I/O application of splittable DoFn – PiSampler
Though splittable DoFn shows most of its strengths when providing inputs to pipelines, it has other interesting use cases as well. In this task, we will investigate one of them: a Monte Carlo method for estimating the value of Pi. Although this is not an efficient algorithm for estimating the value of Pi, it is simple enough to provide a good example of a splittable DoFn use case. The approach that we will investigate can be extended to other similar use cases such as Gibbs sampling, which might have better practical applications.
As always, let's start by defining our problem.
The problem definition
Create a Monte Carlo method (see Figure 7.5) for estimating the value of Pi. Use splittable DoFn to support distributed computation, specifying the (ideal) target parallelism and the number of samples drawn in each parallel worker.
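To make the two parameters concrete, here is a minimal sketch in plain Java of the standard Monte Carlo estimate of Pi: draw points uniformly from the unit square and multiply the fraction that lands inside the quarter circle of radius 1 by four. It deliberately does not use Beam or a splittable DoFn. The class name PiSketch, the methods countInside and estimatePi, and the seeding scheme are assumptions made for illustration only, and a parallel stream stands in for the distributed workers.

```java
import java.util.SplittableRandom;
import java.util.stream.LongStream;

public class PiSketch {

  // Draws `samples` points uniformly from the unit square and counts how many
  // fall inside the quarter circle x^2 + y^2 <= 1.
  public static long countInside(long samples, long seed) {
    SplittableRandom random = new SplittableRandom(seed);
    long inside = 0;
    for (long i = 0; i < samples; i++) {
      double x = random.nextDouble();
      double y = random.nextDouble();
      if (x * x + y * y <= 1.0) {
        inside++;
      }
    }
    return inside;
  }

  // Runs `parallelism` independent workers, each drawing `samplesPerWorker`
  // samples, and combines their counts into a single estimate of Pi.
  // Seeding each worker with (seed + worker index) is an illustrative choice
  // that keeps the result reproducible.
  public static double estimatePi(int parallelism, long samplesPerWorker, long seed) {
    long inside = LongStream.range(0, parallelism)
        .parallel()
        .map(worker -> countInside(samplesPerWorker, seed + worker))
        .sum();
    // The quarter circle covers Pi / 4 of the unit square.
    return 4.0 * inside / (parallelism * samplesPerWorker);
  }

  public static void main(String[] args) {
    System.out.println(estimatePi(4, 1_000_000L, 42L));
  }
}
```

Each worker only reports a count, so the partial results combine with a simple sum. That independence is what makes the sampling easy to split across parallel workers.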
As part of the problem definition, we will define the Monte Carlo...