Summary
It is interesting to note how the term data lake came about. It is not called a pond because a pond is perceived to be small. It is not called a sea or an ocean because saltwater makes it look murky and the waves are rough and uncontrolled. It is not called a stream because "streaming" is already heavily used in the context of real-time processing. It is not called a river because a river's water drains away, whereas the vision of a data lake is that of a pristine reservoir that provides food and shelter to a wide variety of flora and fauna. Without careful governance and management, however, that lake could turn into a swamp.

In this chapter, we went over the need for data consolidation and how Delta helps with data reliability, quality, and governance, giving us curated, analytics-ready data and preventing silos and swamps. Data, once curated, remains in an open format and serves multiple use cases across different data personas, enabling them to be more agile in onboarding new use cases and...