Organizing your data lake
A well-structured system of zones (or layers) and folders helps you keep control of your data lake. On the one hand, a canonical approach makes it easier to understand the structures and the semantics behind these zones and folders. On the other hand, a consistent structure lets you use generator approaches to automate processes in your modern data warehouse.
Many Big Data projects suffer from poorly organized folder structures, which makes it a challenge to find the right data for the right analysis at the right time. The resulting so-called data swamp can be nearly impossible to use, and it discourages users from investing the effort needed to work with it.
Talking about zones in your data lake
In Chapter 1, Balancing the Benefits of Data Lakes over Data Warehouses, we addressed the question of zones in a data lake. We compared them to the layers in a data warehouse and found that the two are quite similar and mostly follow the same semantics: