Chapter 10: Designing and Implementing a Data Lake Using Azure Storage
In Chapter 9, Data Vault Modeling, you learned how to design a Data Vault data warehouse, which is both flexible and scalable. It is flexible in the sense that it is agile: it adapts easily to changing circumstances, such as new source databases or new reporting requirements. It is, however, a relational database implementation, and relational databases are best at handling structured data.
What if you have data in JSON documents stored in Cosmos DB? What will you do with web logs, error logs, and other semi-structured or unstructured data that is also worth analyzing? Are you going to transform all of that data into table structures?
Instead of, or in combination with, a Data Vault data warehouse, you might implement a data lake. This chapter teaches you the why and the how of designing and implementing a data lake. You will learn about the following topics:
- Background of data lakes
- Modeling a...