Using SQL for data transformations
SQL is an essential tool for data analysis and transformation. Many data pipelines in the workplace are written in SQL, and most data professionals are accustomed to using it for analytics work.
The good news is that you can use SQL in Polars from Python as well. This opens the door for those who might not be as familiar with a DataFrame library. In this recipe, we'll cover how to configure Polars to use SQL and how to implement simple SQL queries such as aggregations.
Getting ready
We’ll use the Contoso dataset for this recipe as well. Run the following code to read the dataset:
import polars as pl

df = pl.read_csv('../data/contoso_sales.csv', try_parse_dates=True)
How to do it…
Here’s how to use SQL in Polars:
- Define the SQL context and register your DataFrame:
ctx = pl.SQLContext(eager=True)
ctx.register('df', df)
- Create a simple query and execute it:
ctx.execute( ...