Triggering jobs based on file arrival
File arrival triggers start a Databricks Workflow when new files appear in an external location such as Amazon S3 or Azure storage. They are useful when new data arrives irregularly and a scheduled job would fit poorly. The trigger checks for new files roughly once a minute. It adds no cost beyond the cloud provider's charges for listing files in the storage location.
In this recipe, you will learn how to set up file arrival triggers to run Databricks Workflows.
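To make the idea concrete before the walkthrough, here is a minimal sketch of what a job definition with a file arrival trigger might look like when expressed as a Jobs API style payload. The field names (`trigger`, `file_arrival`, `url`, `pause_status`) follow the Jobs API 2.1 as we understand it, so check them against the current API reference. The job name, notebook path, bucket, and cluster ID are hypothetical placeholders, and the sketch only builds and prints the payload without calling any Databricks endpoint.

```python
import json


def build_file_arrival_job(job_name, notebook_path, storage_url, cluster_id):
    """Build a job payload whose trigger watches a storage path for new files.

    All argument values used below are hypothetical placeholders.
    """
    # Assumption: the monitored URL is expected to end with a slash,
    # so one is appended defensively if it is missing.
    if not storage_url.endswith("/"):
        storage_url += "/"

    return {
        "name": job_name,
        "trigger": {
            # UNPAUSED means the trigger is active as soon as the job exists.
            "pause_status": "UNPAUSED",
            "file_arrival": {"url": storage_url},
        },
        "tasks": [
            {
                "task_key": "process_new_files",
                "notebook_task": {"notebook_path": notebook_path},
                "existing_cluster_id": cluster_id,
            }
        ],
    }


payload = build_file_arrival_job(
    job_name="ingest-on-file-arrival",           # hypothetical job name
    notebook_path="/Workspace/ingest/landing",   # hypothetical notebook
    storage_url="s3://example-bucket/landing",   # hypothetical external location path
    cluster_id="0000-000000-example",            # hypothetical cluster ID
)
print(json.dumps(payload, indent=2))
```

A payload like this could be submitted through the Jobs API or the Databricks CLI. The recipe itself configures the same trigger through the Workflows UI.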
Getting ready
Before you can use file arrival triggers, you need to meet the following conditions:
- Your workspace must have Unity Catalog enabled. Unity Catalog is a unified governance solution for data and AI assets on the lakehouse.
- An external location that belongs to the Unity Catalog metastore is recommended. This is an object that has both a cloud storage path and a storage credential that allows access to that...