Definition of a data lake
It's a good time to be alive. We have a tremendous amount of information available with just a few keystrokes (thank you, Google) or a simple voice command (thank you, Alexa).
The data that companies generate is richer than ever before, and its volume is growing at an exponential rate. Fortunately, the processing power needed to harness this deluge of data keeps increasing and becoming cheaper. Cloud technologies such as AWS allow us to scale data storage and processing almost instantaneously and on a massive scale.
Data is everywhere today. It was always there, but it used to be too expensive to keep. With the massive drop in storage costs, enterprises now keep much of what they previously threw away. And this is the problem.
Many enterprises are collecting, ingesting, and purchasing vast amounts of data but are struggling to gain insights from it...