Describing the anatomy of an Apache Beam runner
Let's first take a look at the typical life cycle of a pipeline, from construction time to pipeline teardown. The complete life cycle is illustrated in the following figure:
The pipeline construction is already well known – we spent most of this book showing how to construct and test pipelines. The next step is submitting the pipeline to a runner. This is the point where the pipeline crosses the SDK-runner boundary, typically by a call to Pipeline.run().
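To make the SDK-runner boundary more concrete, here is a minimal, dependency-free sketch of the idea. It is not Beam's implementation: ToyPipeline, ToyRunner, apply, and run_pipeline are hypothetical names chosen for illustration only. The point it tries to show is that construction merely records a description of the work, and nothing is processed until run() hands that description to the runner.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List


class ToyRunner:
    """Hypothetical stand-in for a runner; not part of the Beam API."""

    def run_pipeline(self, pipeline: "ToyPipeline") -> List[int]:
        # Runner side of the boundary: only here is any data processed.
        elements = list(pipeline.source)
        for step in pipeline.steps:
            elements = [step(e) for e in elements]
        return elements


@dataclass
class ToyPipeline:
    """Hypothetical stand-in for the SDK-side pipeline object."""

    source: Iterable[int]
    runner: ToyRunner
    steps: List[Callable[[int], int]] = field(default_factory=list)

    def apply(self, fn: Callable[[int], int]) -> "ToyPipeline":
        # Construction time: record the transform, do not execute it.
        self.steps.append(fn)
        return self

    def run(self) -> List[int]:
        # Submission: hand the recorded description over to the runner.
        return self.runner.run_pipeline(self)


calls: List[int] = []


def double(x: int) -> int:
    calls.append(x)  # lets us observe when the function actually runs
    return x * 2


pipeline = ToyPipeline(source=[1, 2, 3], runner=ToyRunner()).apply(double)
executed_during_construction = list(calls)  # still empty at this point
result = pipeline.run()
print(executed_during_construction, result)  # prints: [] [2, 4, 6]
```

In this sketch, the user function is invoked only after run() is called, which mirrors the separation between constructing a pipeline and having a runner execute it. A real runner does considerably more than loop over a list, as the following steps describe.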
After the pipeline is submitted to the runner, the runner proceeds as follows:
- Once a runner receives a pipeline, it first performs pipeline validation. This consists of various runner-independent validations – for instance, validating that an appropriate window function and/or trigger is set, depending on the boundedness of the pipeline's inputs. These...