Putting Everything Together with Airflow
So far, we have covered the different aspects and steps of data ingestion. We have seen how to configure and ingest structured and unstructured data, what analytical data is, and how to improve logs for more insightful monitoring and error handling. Now it is time to bring all of this together into something that resembles a real-world project.
From this chapter onward, we will use Apache Airflow, an open-source platform that allows us to create, schedule, and monitor workflows. Let’s start our journey by configuring Airflow and understanding its fundamental concepts and what the tool is capable of.
In this chapter, you will learn about the following topics:
- Configuring Airflow
- Creating DAGs
- Creating custom operators
- Configuring sensors
- Creating connectors in Airflow
- Creating parallel ingest tasks
- Defining ingest-dependent DAGs
By the end of this...