Data Ingestion
Data ingestion is the process of bringing raw data into the data lake, where data from the various sources lands in the raw zone. Depending on where the data comes from (on-premises systems, other cloud platforms, and so on), you can use different ingestion tools. The following are some of the options available in Azure to ingest data:
- Azure Data Factory: You are already familiar with Azure Data Factory (ADF), which supports data ingestion from a wide range of sources through its built-in connectors (more than 90), including other clouds such as Amazon Web Services (AWS), Google Cloud Platform (GCP), and Oracle. You will use it again to build your pipeline, as recommended in the syllabus.
- Azure Copy (AzCopy): A command-line tool that copies data over the internet. It is best suited to smaller data sizes (preferably in the 10–15 TB range).
Note
You can learn more about AzCopy at https://packt.link/zAACw.
- Azure ExpressRoute...
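As a rough illustration of the AzCopy option listed above, the following Python sketch assembles an `azcopy copy` command that uploads a local folder to the raw zone. The helper name `build_azcopy_command`, the local path, and the destination URL are hypothetical placeholders and not values from this chapter; the sketch also assumes that AzCopy is installed and that the destination is authorized with a SAS token. It only builds and prints the command, so nothing is transferred until you run it yourself.

```python
import shlex


def build_azcopy_command(source, destination, recursive=True):
    """Return the argument list for an `azcopy copy` invocation."""
    cmd = ["azcopy", "copy", source, destination]
    if recursive:
        # --recursive copies the contents of subdirectories as well
        cmd.append("--recursive")
    return cmd


# Hypothetical placeholders: replace with your own folder, storage account,
# container, and SAS token before running the command.
local_dir = "/data/exports"
raw_zone_url = (
    "https://<storage-account>.dfs.core.windows.net/<container>/raw?<SAS-token>"
)

command = build_azcopy_command(local_dir, raw_zone_url)

# Print a shell-quoted version of the command for review.
print(shlex.join(command))

# To actually start the transfer, you could pass the list to subprocess:
#   import subprocess
#   subprocess.run(command, check=True)
```

Building the command as a list rather than a single string avoids shell-quoting problems when the SAS token contains characters such as `&`.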