Azure Data Lake (ADL) is Microsoft's storage and analytics service for big data. It can store data at petabyte scale and run efficient queries over the stored data. Storage and analytics are separate services in Azure, so ADL actually consists of two different products: Azure Data Lake Storage (ADLS) and Azure Data Lake Analytics (ADLA). In this section, we will focus on ADLA, but we will also touch on ADLS where appropriate.
Data Lake Storage is a file-based store, with files organized into directories. This type of storage is called schemaless, since there are no constraints on what type of data can be stored in the Data Lake. A single directory can contain both text files and images, and the data type is specified only when the data is read from the Data Lake, an approach often called schema-on-read. This is particularly useful in big data scenarios where...