Understanding the ETL Process and Data Pipelines
With a firm foundation in Python under our belts and a clean development environment established, we can now turn to the fundamentals of data pipelines.
In this chapter, we will define what a data pipeline is and take a more in-depth look at the process of building robust pipelines. We will then discuss different approaches, such as the Extract, Transform, and Load (ETL) and Extract, Load, and Transform (ELT) methodologies, and how they tie into automating data movement effectively.
By the end of this chapter, you will have an established workflow for building data pipelines within your local environment and will have covered the following topics:
- What is a data pipeline?
- Creating robust data pipelines
- What is an ETL pipeline? How do ETL pipelines differ from ELT pipelines?
- Automating ETL pipelines
- Example use cases for ETL pipelines