Section 1: Introduction to Pachyderm and Reproducible Data Science
This section introduces the basics of Pachyderm and explains why data reproducibility matters for an enterprise-level data science platform. You will learn about the main pillars of the Pachyderm solution, including repositories, datums, jobs, and, most important of all, the pipeline. The section also briefly discusses the ethics of AI as they relate to reproducibility.
This section comprises the following chapters:
- Chapter 1, The Problem of Data Reproducibility
- Chapter 2, Pachyderm Basics
- Chapter 3, Pachyderm Pipeline Specification