Mounting Azure Data Lake in Databricks
Azure Data Lake, which is provisioned through an Azure storage account, is a highly scalable data lake service managed by Microsoft. It stores big data of all forms (structured, semi-structured, and unstructured) and makes it available for analysis. It is built for big data analytics and integrates with Hadoop and Spark. We will use Azure Data Lake often in this book, so it is important to learn how it integrates with Azure Databricks.
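To make the integration a little more concrete, here is a minimal sketch of how Spark typically addresses data in a storage account. It assumes the account has the hierarchical namespace enabled (Azure Data Lake Storage Gen2) and is reached through the `abfss://` scheme. The helper names `is_valid_account_name` and `abfss_uri`, and the account and container names in the usage note, are hypothetical and are not part of any Azure or Databricks API.

```python
import re

# Azure storage account names must be 3-24 characters long and may
# contain only lowercase letters and digits.
_ACCOUNT_NAME = re.compile(r"[a-z0-9]{3,24}")


def is_valid_account_name(name: str) -> bool:
    """Return True if `name` satisfies the storage account naming rules."""
    return _ACCOUNT_NAME.fullmatch(name) is not None


def abfss_uri(account: str, container: str, path: str = "") -> str:
    """Build the abfss:// URI for a path inside a container.

    Assumption: the storage account is an ADLS Gen2 account, so it is
    served from the <account>.dfs.core.windows.net endpoint.
    """
    if not is_valid_account_name(account):
        raise ValueError(f"invalid storage account name: {account!r}")
    # Drop leading slashes so the URI has exactly one slash after the host.
    return f"abfss://{container}@{account}.dfs.core.windows.net/{path.lstrip('/')}"
```

For example, `abfss_uri("mydatalake", "raw", "/sales/2024")` returns `abfss://raw@mydatalake.dfs.core.windows.net/sales/2024`. A URI of this shape is what a Databricks notebook would reference once authentication to the storage account has been configured.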
Creating an Azure Data Lake instance
Now, we will create an Azure Data Lake instance and use it as our primary data store in the Azure environment:
- Go to the Azure portal (portal.azure.com) and sign in. Open an existing resource group, then click on Create and then on Marketplace. Search for Storage account. The following window opens up:
- Click on Create. Set the Resource group, Storage account name, and Region. Set Performance to Standard: Recommended...