ADLS for raw data ingestion
Before diving deeper into ingestion architectures, we need to introduce the fundamentals of data lakes, because that is where ingested data lands in most cases.
A data lake can be seen as mass storage that supports all kinds of data. It does not enforce specific file types or data types, which makes it a remarkably good landing zone for ingestion. The more rules a target enforces (as structured databases do, for example), the likelier it is that a data ingestion pipeline will break when a file type or schema changes.
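The contrast between a rule-enforcing target and a data lake landing zone can be sketched in a few lines of Python. This is a minimal illustration and does not call any Azure API. The column list, the `load_into_strict_table` and `land_in_lake` helpers, and the `raw/orders/...` paths are all hypothetical names chosen for the example, and the "lake" is simulated by an in-memory dictionary.

```python
import csv
import io

# Hypothetical column set that a strict, database-like target expects.
EXPECTED_COLUMNS = ["id", "name", "amount"]


def load_into_strict_table(csv_text: str) -> list[dict]:
    """Simulate a schema-enforcing target: reject any unexpected header."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames != EXPECTED_COLUMNS:
        raise ValueError(f"schema mismatch: {reader.fieldnames}")
    return list(reader)


def land_in_lake(lake: dict[str, bytes], path: str, payload: bytes) -> None:
    """Simulate a data lake landing zone: store the bytes untouched."""
    lake[path] = payload


original = "id,name,amount\n1,alice,10\n"
changed = "id,name,amount,currency\n2,bob,20,EUR\n"  # upstream added a column

# The simulated lake accepts both batches, whatever their schema.
lake: dict[str, bytes] = {}
for i, text in enumerate([original, changed]):
    land_in_lake(lake, f"raw/orders/batch_{i}.csv", text.encode("utf-8"))

# The strict target accepts the first batch and rejects the second.
strict_results = []
for text in [original, changed]:
    try:
        load_into_strict_table(text)
        strict_results.append("ok")
    except ValueError:
        strict_results.append("rejected")

print(strict_results)
print(sorted(lake))
```

In this sketch the schema check is deferred rather than removed: both files are preserved in the landing zone, and a downstream step can decide how to handle the new column.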
On Azure, a data lake is a specific kind of Azure Storage account. Therefore, we will first introduce this service and its features.
Azure Storage accounts
Azure Storage accounts can store all kinds of data objects. They provide four distinct types of storage, as follows:
- Binary Large Object (Blob) storage
- File storage
- Queue storage
- Table storage ...
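Each of these four services is exposed under its own endpoint within the same account. The following sketch builds those endpoints, assuming the standard endpoint format of the public Azure cloud (`https://<account>.<service>.core.windows.net`). Other Azure clouds and custom domains use different host names. The account name `mystorageaccount` and the `service_endpoint` helper are hypothetical and serve only as illustration.

```python
# The four storage services listed above, keyed by the sub-domain
# each one uses in the default public-cloud endpoint format.
SERVICES = ("blob", "file", "queue", "table")


def service_endpoint(account_name: str, service: str) -> str:
    """Return the default public-cloud endpoint for one storage service."""
    if service not in SERVICES:
        raise ValueError(f"unknown storage service: {service}")
    return f"https://{account_name}.{service}.core.windows.net"


# Hypothetical account name, used only for illustration.
endpoints = {s: service_endpoint("mystorageaccount", s) for s in SERVICES}

for name, url in endpoints.items():
    print(f"{name:<6}{url}")
```

Because all four endpoints share the account name, one storage account acts as a common namespace for its blobs, file shares, queues, and tables.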