The legacy Source API and the Read transform
Before the creation of the splittable DoFn object, Beam used the Source API and its associated Read transform. Although this transform is currently deprecated and should not be used for implementing new sources, it is still supported. On some runners and under specific conditions, using the deprecated Read transform might still be preferred. We have already seen an example of this: the use_deprecated_read value passed through the --experiments flag for Python's ReadFromKafka transform.
The Read transform accepts a single parameter: either an object of the BoundedSource type or the UnboundedSource type. Whether the source is bounded or unbounded then determines if the resulting PCollection object is bounded or unbounded.
We apply the Read transform as follows:
Pipeline p = ...;
p.apply(Read.from(new MyUnboundedSource()));
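To illustrate how the type of the source determines the boundedness of the resulting PCollection, the following is a minimal sketch rather than a definitive implementation. It assumes the Beam Java SDK is on the classpath and uses the SDK's built-in CountingSource in place of a custom source; the class name ReadBoundednessSketch and its two helper methods are hypothetical names introduced only for this illustration. The pipelines are constructed but never run, so no runner is involved.

```java
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.CountingSource;
import org.apache.beam.sdk.io.Read;
import org.apache.beam.sdk.values.PCollection;

// Hypothetical class name, used only for this illustration.
public class ReadBoundednessSketch {

  // Applies Read to a BoundedSource and returns the boundedness
  // of the resulting PCollection.
  static PCollection.IsBounded boundedRead() {
    Pipeline p = Pipeline.create();
    // CountingSource.upTo(n) is a BoundedSource<Long> shipped with the SDK.
    PCollection<Long> numbers = p.apply(Read.from(CountingSource.upTo(10)));
    return numbers.isBounded();
  }

  // Applies Read to an UnboundedSource and returns the boundedness
  // of the resulting PCollection.
  static PCollection.IsBounded unboundedRead() {
    Pipeline p = Pipeline.create();
    // CountingSource.unbounded() is an UnboundedSource of Long values.
    PCollection<Long> numbers = p.apply(Read.from(CountingSource.unbounded()));
    return numbers.isBounded();
  }

  public static void main(String[] args) {
    // The pipelines are only built, not executed.
    System.out.println(boundedRead());
    System.out.println(unboundedRead());
  }
}
```

The same Read.from call is used in both methods; only the type of the source passed to it differs, and that is what the boundedness of the output follows from.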
We will not go into the details of BoundedSource or UnboundedSource, mostly because...