Building and executing a data pipeline using Airflow
In the preceding section, you built a data pipeline to ingest and process data. Now imagine that new flight data arrives once a week and you need to process it each time. One option is to run the data pipeline manually; however, this approach does not scale as the number of data pipelines grows. Data engineers' time is better spent writing new pipelines than repeatedly re-running old ones. A second concern is security. You may have developed the data pipeline against sample data, and your team may not have access to the production data needed to execute it.
Automation solves both problems. You can schedule your data pipelines to run as required, freeing data engineers for more valuable work. An automated pipeline can also connect to production data without any involvement from the development team, which improves security.
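As a minimal sketch of such a schedule, the following Airflow DAG runs an ingest step followed by a process step once a week. The DAG id, the callables, and the start date are hypothetical placeholders, not names from the pipeline built earlier:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def ingest_flights():
    # Placeholder: fetch the week's new flight data from the source system.
    print("ingesting new flight data")


def process_flights():
    # Placeholder: clean and transform the ingested flight data.
    print("processing flight data")


# "@weekly" tells the Airflow scheduler to trigger one run per week;
# catchup=False skips back-filling runs for past weeks.
with DAG(
    dag_id="flights_weekly",          # hypothetical DAG name
    start_date=datetime(2024, 1, 1),  # hypothetical start date
    schedule_interval="@weekly",
    catchup=False,
) as dag:
    ingest = PythonOperator(task_id="ingest", python_callable=ingest_flights)
    process = PythonOperator(task_id="process", python_callable=process_flights)

    # The >> operator declares that ingest must complete before process starts.
    ingest >> process
```

Once this file is placed in Airflow's DAGs folder, the scheduler picks it up and runs it on the weekly cadence with no manual intervention, using whatever production connections are configured in Airflow rather than the developer's own credentials.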
The ML platform contains...