Using Delta Lake
When using Databricks, we can also use Delta Lake, its open source storage layer. It sits on top of existing data lake storage and brings several benefits to it. Here are a few of them:
- ACID transactions: Provides serializability, the strongest isolation level, for concurrent reads and writes of data.
- Time Travel and Audit of History: Keeps snapshots of the data, which lets us revert to a previous version and see what happened to our data over time. With the Delta Lake engine, we can see the state of the data at any point in its history.
- Updates and Deletes: These data manipulation language (DML) operations are usually unsupported or hard to perform with other big data technologies. The Delta Lake engine supports them and also adds the MERGE command on top of them.
- Compatible with the Apache Spark API: It can be used in existing Spark code without many changes.
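To make the features above more concrete, here is a minimal sketch in Python that builds the corresponding Delta Lake SQL statements as plain strings. The helper functions and the table and column names (events, updates, id, status) are hypothetical and only serve as illustration; the SQL keywords themselves (DESCRIBE HISTORY, VERSION AS OF, UPDATE, DELETE, and MERGE INTO) are the ones Delta Lake provides.

```python
# Sketch only: helpers that build Delta Lake SQL statements as plain strings.
# The function names and any table or column names passed to them
# (events, updates, id, status) are hypothetical, chosen for illustration.


def history_query(table: str) -> str:
    """Audit of history: list the commits recorded for a Delta table."""
    return f"DESCRIBE HISTORY {table}"


def time_travel_query(table: str, version: int) -> str:
    """Time travel: read the table as it was at a given version number."""
    return f"SELECT * FROM {table} VERSION AS OF {version}"


def update_statement(table: str, assignment: str, condition: str) -> str:
    """Update: change the rows that match a condition."""
    return f"UPDATE {table} SET {assignment} WHERE {condition}"


def delete_statement(table: str, condition: str) -> str:
    """Delete: remove the rows that match a condition."""
    return f"DELETE FROM {table} WHERE {condition}"


def merge_statement(target: str, source: str, key: str) -> str:
    """Merge (upsert): update rows that match on the key, insert the rest."""
    return (
        f"MERGE INTO {target} AS t USING {source} AS s "
        f"ON t.{key} = s.{key} "
        "WHEN MATCHED THEN UPDATE SET * "
        "WHEN NOT MATCHED THEN INSERT *"
    )


if __name__ == "__main__":
    # Print each statement so it can be inspected before running it.
    for statement in (
        history_query("events"),
        time_travel_query("events", 0),
        update_statement("events", "status = 'done'", "id = 7"),
        delete_statement("events", "id = 7"),
        merge_statement("events", "updates", "id"),
    ):
        print(statement)
```

In a Databricks notebook, where a spark session is already defined, each of these strings could be passed to spark.sql(...), for example spark.sql(time_travel_query("events", 0)), assuming that a Delta table with that name exists.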
For a complete list of features, go to the following URL:
The Delta Lake engine...