Engineering Data with a Lakehouse
Data lakes are a relatively recent concept: the term was coined in 2010 by James Dixon, who was then Pentaho’s Chief Technology Officer (CTO). Dixon used the term data lake to differentiate the concept from data marts and data warehouses. Whereas data marts and data warehouses hold transformed and structured data, data lakes ingest both structured and unstructured data in its original form.
Lakehouses are an even newer architecture that seeks to combine the benefits of a data lake and a data warehouse, providing a unified platform for all types of data at scale.
Organizations that require or prefer full support for SQL, including read and write transactions, may opt for a data warehouse instead of a lakehouse. Many organizations, however, are increasingly comfortable performing data transformations in non-SQL languages and tools, so the read-only SQL endpoint of lakehouses is not an impediment to adoption...