Reading and writing Delta Lake tables
Delta Lake is an open source storage layer built on top of the Parquet format. It adds features that plain Parquet lacks, such as table versioning and ACID guarantees. In practice, a Delta Lake table is a directory of Parquet files plus a transaction log that records every change to the table.
Many data pipelines are now built on a lakehouse architecture, which combines elements of data lakes and data warehouses. Delta Lake is a popular table format for this architecture and is used by many companies. Delta Lake tables are stored in your data lake, yet they can be queried and used like relational tables. So, Polars being able to work with Delta Lake tables is a big plus.
In this recipe, we’ll look at how to read and write Delta Lake tables, along with a few useful parameters.
Getting ready
This recipe requires you to install another Python library, deltalake. It’s a dependency required for Polars to work with Delta Lake tables. Run the following command to install it in your Python environment:
>...
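Once the installation has finished, it can be worth confirming that both libraries are visible to the Python environment you plan to use. The following is a minimal sketch using only the standard library. The helper name package_status is hypothetical and is not part of Polars or deltalake. It assumes that the import names and the distribution names are both polars and deltalake.

```python
from importlib import metadata, util


def package_status(name: str) -> str:
    """Return a one-line description of whether `name` can be imported."""
    # find_spec returns None when a top-level module cannot be located.
    if util.find_spec(name) is None:
        return f"{name}: not installed"
    try:
        # Look up the installed distribution's version string.
        version = metadata.version(name)
    except metadata.PackageNotFoundError:
        # Importable, but no distribution metadata under this name.
        version = "unknown version"
    return f"{name}: {version}"


# Assumption: both libraries are installed under these names.
for pkg in ("polars", "deltalake"):
    print(package_status(pkg))
```

If either line reports "not installed", the install command was probably run in a different environment from the one executing this script.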