Design Principles for Creating Scalable and Resilient Pipelines
The true art of data engineering lies in the architecture of the pipeline. This chapter covers the most effective design patterns and the leading open source Python libraries for building an enterprise-grade ETL pipeline. It shows how to install these libraries and gives primers on the functions they provide for creating robust pipelines. It also explains the design patterns and approaches available for building the ETL process.
By the end of this chapter, you will have an established workflow for building data pipelines within your local environment. The chapter covers the following topics:
- Understanding the design patterns for ETL pipelines
- Preparing your local environment for installations
- Open source Python libraries for ETL pipelines
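
As a point of reference for the topics above, the following is a minimal sketch of the extract, transform, load structure that ETL design patterns build on. It uses only the Python standard library (`csv`, `sqlite3`). The function names, the `sales` table, and the sample data are hypothetical and chosen purely for illustration; they are not taken from any library discussed in this chapter.

```python
import csv
import io
import sqlite3


def extract(source):
    """Extract: read rows from a CSV file-like object as dictionaries."""
    return list(csv.DictReader(source))


def transform(rows):
    """Transform: drop rows with a missing amount and cast the fields."""
    cleaned = []
    for row in rows:
        if not row["amount"]:
            continue  # skip incomplete records
        cleaned.append({"id": int(row["id"]), "amount": float(row["amount"])})
    return cleaned


def load(rows, connection):
    """Load: write the cleaned rows to a (hypothetical) sales table."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS sales (id INTEGER PRIMARY KEY, amount REAL)"
    )
    connection.executemany(
        "INSERT INTO sales (id, amount) VALUES (:id, :amount)", rows
    )
    connection.commit()
    return len(rows)


def run_pipeline(source, connection):
    """Chain the three stages; each stage depends only on the previous output."""
    return load(transform(extract(source)), connection)


if __name__ == "__main__":
    sample = io.StringIO("id,amount\n1,10.5\n2,\n3,4.0\n")
    conn = sqlite3.connect(":memory:")
    print(run_pipeline(sample, conn))
```

Keeping each stage as a separate function means a stage can be tested or replaced on its own, which is the property the more elaborate patterns tend to preserve.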