Data lake patterns
There are two types of data lake patterns, as follows:
- Centralized pattern
- Distributed pattern
Let’s discuss each of them. Note that you can use a hybrid pattern too, depending on your use case.
Centralized pattern
In a centralized pattern, business data is stored in and accessed from a central location, for use throughout the enterprise. Entity information about a person, such as name, address, gender, age, and profession, is a good candidate for this pattern: managing such datasets centrally simplifies governance and avoids data duplication.
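As a rough illustration of the idea, the following Python sketch models a central store that holds one record per person. The names `PersonEntity`, `CentralEntityStore`, and the `person_id` key are assumptions made for this example, not part of any particular product, and a real data lake would use durable storage and a catalog rather than an in-memory dictionary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PersonEntity:
    """Core person attributes shared across the enterprise (hypothetical schema)."""
    person_id: str
    name: str
    address: str
    gender: str
    age: int
    profession: str


class CentralEntityStore:
    """Illustrative in-memory stand-in for a central entity dataset."""

    def __init__(self):
        # Maps person_id -> PersonEntity; one entry per person.
        self._records = {}

    def upsert(self, entity):
        # Keyed by person_id, so a second write replaces the record
        # instead of creating a duplicate.
        self._records[entity.person_id] = entity

    def get(self, person_id):
        return self._records[person_id]

    def count(self):
        return len(self._records)


store = CentralEntityStore()
store.upsert(PersonEntity("p-001", "Jane Doe", "1 Main St", "F", 34, "Engineer"))
# An address change arriving from another team updates the same record.
store.upsert(PersonEntity("p-001", "Jane Doe", "9 Oak Ave", "F", 34, "Engineer"))

# Every consumer reads the same, current record.
sales_view = store.get("p-001")
support_view = store.get("p-001")
```

Because every write goes through a single keyed store, governance rules (who may update an address, for instance) only need to be enforced in one place, and consumers never hold diverging copies of the same person.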
Certain lines of business (LOBs) may need additional attributes that are relevant only to their own use cases. For example, the marketing department may also want to see customer lifetime value (CLV), net promoter score (NPS), marketing preferences, and so on for a person. These additional attributes can then...