Understanding inputs
Chapter 2, Pachyderm Basics, describes inputs in detail with examples, so this section only notes that the input defines the type of your pipeline. You can specify the following types of Pachyderm inputs:
- PFS is the generic input that defines a standard pipeline. It is also the building block for the inputs in all multi-input pipelines.
- Cross is an input that creates a cross-product of the datums from two input repositories. The resulting output will include all possible combinations of all datums from the input repositories.
- Union is an input that adds datums from one repository to the datums in another repository.
- Join is an input that matches datums from different repositories whose paths share a specific naming pattern.
- Spout is an input that consumes data from a third-party source and adds it to the Pachyderm filesystem for further processing.
- Group is an input that combines datums from one or more repositories based on a configured naming pattern.
- Cron is a pipeline...
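
To make the difference between the multi-input types more concrete, the following Python snippet is a toy model of how cross, union, join, and group inputs assemble datums. It is only a sketch and not Pachyderm's implementation: the file names, the two repository lists, and the helper functions are hypothetical, and the "naming pattern" is assumed to be the first run of digits in each file name.

```python
import re
from itertools import product

# Hypothetical datums (file paths) from two illustrative repositories.
repo_a = ["/a1.csv", "/a2.csv"]
repo_b = ["/b1.csv", "/b2.csv", "/b3.csv"]


def name_key(path):
    """Assumed naming pattern: the first run of digits in the path."""
    return re.search(r"\d+", path).group()


def cross(left, right):
    """Pair every datum on the left with every datum on the right."""
    return list(product(left, right))


def union(left, right):
    """Keep the datums of both inputs side by side, without pairing them."""
    return [(datum,) for datum in left + right]


def join(left, right, key):
    """Pair only the datums whose keys match."""
    return [(l, r) for l, r in product(left, right) if key(l) == key(r)]


def group(datums, key):
    """Collect all datums that share a key into one group."""
    groups = {}
    for datum in datums:
        groups.setdefault(key(datum), []).append(datum)
    return groups


cross_datums = cross(repo_a, repo_b)
union_datums = union(repo_a, repo_b)
join_datums = join(repo_a, repo_b, name_key)
group_datums = group(repo_a + repo_b, name_key)

print(len(cross_datums))  # 2 * 3 combinations
print(len(union_datums))  # 2 + 3 individual datums
print(join_datums)        # only the pairs with matching numbers
print(group_datums)       # datums bucketed by number
```

In this model, cross grows multiplicatively with the size of the inputs, while union grows additively, which is worth keeping in mind when you estimate how many datums a pipeline will have to process.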