Creating DAGs
The core concept of Airflow is the DAG (directed acyclic graph), which collects, groups, and organizes tasks to be executed in a specific order. A DAG is also responsible for managing the dependencies between its tasks. Simply put, it is not concerned with what a task does, only with how and when to execute it. Typically, a DAG starts at a scheduled time, but we can also define dependencies on other DAGs so that it starts based on their execution statuses.
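To make the ordering idea concrete, here is a minimal sketch in plain Python. It does not use Airflow's API; it relies only on the standard library's graphlib module (Python 3.9+) to show how a set of dependencies determines a run order. The task names (extract, transform, load, notify) and the execution_order helper are hypothetical, invented purely for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical task names, not taken from the recipe.
# Each key maps a task to the set of tasks that must finish before it runs.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "notify": {"load"},
}


def execution_order(deps):
    """Return one run order that respects every declared dependency."""
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
print(order)
```

Note that the sorter never looks at what each task does; it only needs the dependency edges, which mirrors the role a DAG plays here. If the edges formed a cycle, static_order() would raise graphlib.CycleError, which is why the graph has to be acyclic.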
In this recipe, we will create our first DAG and set it to run on a specific schedule. With this first step, we begin designing our first workflow in practice.
Getting ready
Please refer to the Getting ready section of the Configuring Airflow recipe, since this recipe uses the same technology.
Also, let’s create a directory called ids_ingest inside our dags folder. Inside the ids_ingest folder, we will create two files: __init__.py and ids_ingest_dag.py. The final structure will look...