Understanding how Airflow works
Airflow handles all three of the preceding elements using Python scripts. As data engineers, our job is to write Python code that handles task dependencies, schedules our jobs, and integrates with other systems. This is different from traditional extract, transform, load (ETL) tools. If you have heard of or used tools such as Control-M, Informatica, Talend, or other ETL tools, Airflow occupies the same position as these tools. The difference is that Airflow is not a user interface (UI)-based drag-and-drop tool; it is designed for you to write the workflow using code.
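To make the idea of a workflow written as code more concrete, here is a minimal sketch that uses only the Python standard library. It is not Airflow's API. The task functions (`extract`, `transform`, `load`), the `DEPENDENCIES` mapping, and the `run_workflow` helper are all hypothetical names chosen for illustration, and scheduling and integration with other systems are left out. It shows only how task dependencies can be declared in ordinary Python and resolved into a run order.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


# Hypothetical tasks. In a real pipeline, each one would call another system.
def extract():
    return "extract done"


def transform():
    return "transform done"


def load():
    return "load done"


TASKS = {"extract": extract, "transform": transform, "load": load}

# Each task maps to the set of tasks that must finish before it can start.
DEPENDENCIES = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
}


def run_workflow(tasks, dependencies):
    """Run every task once, in an order that respects the dependencies."""
    order = list(TopologicalSorter(dependencies).static_order())
    return [(name, tasks[name]()) for name in order]


if __name__ == "__main__":
    for name, result in run_workflow(TASKS, DEPENDENCIES):
        print(name, "->", result)
```

Because the whole workflow is plain Python, it can be kept in version control, reviewed, and exercised by automated tests like any other code.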
There are several good reasons why managing the workflow using code is a better idea than using drag-and-drop tools:
- Using code, you can automate a lot of development and deployment processes
- Using code, it’s easier for you to enable good testing practices
- All the configurations can be managed...