Summary
In this chapter, we looked at pipeline orchestration, a key component of data engineering. We covered several options, both open source and paid, so that you can evaluate which solution works best for your data engineering needs. We started with Airflow and Argo, open source tools that are popular among developers. We then looked at Databricks Workflows and ADF, managed solutions that offer rich functionality and seamless integration with other services running in the cloud.
In the next chapter, we will look at performance tuning, which is essential for ensuring that your data engineering workloads run efficiently and cost-effectively.