Chapter 3: Pachyderm Pipeline Specification
A Machine Learning (ML) pipeline is an automated workflow that enables you to execute the same code continuously against different combinations of data and parameters. A pipeline ensures that every cycle is automated and goes through the same sequence of steps. As in many other technologies, an ML pipeline in Pachyderm is defined by a single configuration file called the pipeline specification, or pipeline spec.
The Pachyderm pipeline specification is the most important configuration in Pachyderm because it defines what your pipeline does, how often it runs, how the work is spread across Pachyderm workers, and where the result is written.
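To make this concrete, here is a minimal sketch of what such a specification might look like, built as a Python dictionary and rendered as JSON. The field names (`pipeline`, `input`, `transform`, `parallelism_spec`) follow the commonly documented Pachyderm spec layout, but check them against the reference for your Pachyderm version. The pipeline name, repository, image, and command are hypothetical placeholders, not values from this book.

```python
import json

# Hypothetical values throughout: "example-pipeline", "example-data",
# the container image, and the command are placeholders for illustration.
pipeline_spec = {
    # Identifies the pipeline; by default its output repo shares this name.
    "pipeline": {"name": "example-pipeline"},
    # What data the pipeline reads: a repo plus a glob pattern that
    # controls how the data is split into units of work.
    "input": {"pfs": {"repo": "example-data", "glob": "/*"}},
    # What the pipeline does: the container image and command to run.
    "transform": {
        "image": "python:3.11-slim",
        "cmd": ["python3", "/app/process.py"],
    },
    # How the work is spread: a fixed number of workers.
    "parallelism_spec": {"constant": 2},
}


def render_spec(spec: dict) -> str:
    """Serialize the spec to the JSON text you would save to a file."""
    return json.dumps(spec, indent=2)


print(render_spec(pipeline_spec))
```

The printed JSON could be saved to a file and, assuming a working cluster, passed to `pachctl create pipeline -f <file>`. Each of these sections is covered in detail later in the chapter.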
This chapter is intended as a pipeline specification reference and will walk you through all the parameters you can specify for your pipeline. To do this, we will cover the following topics:
- Pipeline specification overview
- Understanding inputs
- Exploring informational parameters ...