Summary
This chapter gave a flavor of how the data lakehouse concept is implemented on a cloud computing platform. We began by asking why cloud computing is apt for implementing a data lakehouse and revisited the factors that make it the optimal platform for the data lakehouse architecture. The next section of the chapter focused on implementing the data lakehouse architecture on Microsoft Azure. We peeled back layer after layer and discussed the Azure services that you can use to realize each component.
We started with the data ingestion layer and discussed services such as Azure Data Factory (ADF) and Event Hubs, which enable batch and stream data ingestion. We then moved on to the data processing layer, where we explored services such as Azure Databricks, ADF's data flows, Azure Data Explorer, and HDInsight, which can be used to process batch and streaming data. Next, we focused on the data lake...