Object Stores and Data Lakes
Enterprises have leaned heavily on databases and data warehouses for decades. Around the turn of the millennium, as the internet age took hold, the proliferation of connected devices began to produce a volume and variety of data that traditional databases and warehouses could no longer keep up with.
While building a web indexing solution on top of this influx of data, Google published a paper in 2003 titled “The Google File System” (GFS) that would shape industry solutions for the next two decades. The approach it described paved the way for data lakes, which in turn led to lakehouses. A data lake is a distributed file system that provides a cost-efficient way to store structured, semi-structured, and unstructured data. A lakehouse combines the capabilities of a data warehouse and a data lake. We’re going to learn how to work with object stores, which are the foundational storage technology for both data lakes and lakehouses...