Delta table data format
In this recipe, you will learn how data is stored in Delta tables and how Delta tables keep track of the history of all DML operations.
Delta Lake's transaction log (the DeltaLog) records every transaction performed on a Delta table since it was created. When a user performs a DML operation, Delta Lake breaks the operation down into discrete steps, each composed of one or more of the following actions:
- Add file – Adds a data file to the Delta table
- Remove file – Removes a data file
- Update metadata – Updates the table metadata, for example by changing the table name or adding partitioning
- Set transaction – Records that a Structured Streaming job has committed a micro-batch with a given ID
- Change protocol – Enables features by switching the Delta Lake transaction log to the newest software protocol
- Commit info – Contains information around the commit and the details about the...
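To make the action types above concrete, here is a minimal sketch that parses one commit from a transaction log and labels each action. It assumes that commit files under the table's _delta_log directory are newline-delimited JSON, with one action per line keyed by add, remove, metaData, txn, protocol, or commitInfo. The sample commit, its file paths, and the helper names (SAMPLE_COMMIT, parse_actions, count_actions) are made up for illustration and are not taken from a real table.

```python
import json
from collections import Counter

# Hypothetical contents of a single commit file. Every value here is
# illustrative; a real commit carries more fields per action.
SAMPLE_COMMIT = "\n".join([
    json.dumps({"commitInfo": {"operation": "WRITE", "timestamp": 1600000000000}}),
    json.dumps({"protocol": {"minReaderVersion": 1, "minWriterVersion": 2}}),
    json.dumps({"metaData": {"id": "example-table-id", "partitionColumns": ["country"]}}),
    json.dumps({"add": {"path": "country=US/part-00000.parquet", "size": 1024}}),
    json.dumps({"remove": {"path": "country=US/part-00099.parquet"}}),
    json.dumps({"txn": {"appId": "example-stream", "version": 7}}),
])

# Maps the JSON key of each log entry to the action name used in the list above.
ACTION_NAMES = {
    "add": "Add file",
    "remove": "Remove file",
    "metaData": "Update metadata",
    "txn": "Set transaction",
    "protocol": "Change protocol",
    "commitInfo": "Commit info",
}


def parse_actions(commit_text):
    """Return a list of (action_name, payload) pairs, one per non-empty line."""
    actions = []
    for line in commit_text.splitlines():
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        # Each line is assumed to hold one top-level key naming the action.
        for key, payload in record.items():
            actions.append((ACTION_NAMES.get(key, key), payload))
    return actions


def count_actions(commit_text):
    """Count how many actions of each type appear in one commit."""
    return Counter(name for name, _ in parse_actions(commit_text))


if __name__ == "__main__":
    for name, payload in parse_actions(SAMPLE_COMMIT):
        print(f"{name}: {payload}")
```

To inspect a real table, you could read one of the JSON files in its _delta_log directory and pass the text to parse_actions in place of SAMPLE_COMMIT.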