Chapter 8: Creating an End-to-End Machine Learning Workflow
In previous chapters, we covered Pachyderm basics and how to install Pachyderm locally and on a cloud platform. We deployed our first pipeline, learned how to update a pipeline, and performed some fundamental Pachyderm operations, such as splitting. We hope that by now you are convinced that Pachyderm is an extremely versatile tool that gives you a lot of flexibility and power in managing your machine learning pipelines. To demonstrate this further, we will deploy a much more complex example than the ones we have deployed so far. We hope this chapter will be especially fun to work through and will deepen your understanding of the quirks of data infrastructure.
In this chapter, we will deploy a multistep Natural Language Processing (NLP) workflow that will demonstrate how to use Pachyderm at scale.
This chapter includes the following topics:
- NLP example overview
- Creating repositories and...