Using group by aggregations
Group by aggregations are essential in data analysis: they divide a dataset into distinct groups based on categorical values and then apply aggregate functions to each group.
This technique is particularly useful for obtaining summary statistics and insights within subsets of the data. Working per group simplifies the analysis and gives a more nuanced view of the patterns and trends within the data.
In this recipe, we’ll cover how to group your DataFrame and LazyFrame and apply aggregations to each group.
Getting ready
Make sure to import Polars and read the Contoso sales dataset:
import polars as pl
df = pl.read_csv('../data/contoso_sales.csv', try_parse_dates=True)
How to do it...
Here’s how you can use group by aggregations:
- Group your DataFrame by a column called Brand:
df.group_by('Brand')
The preceding code will return the following output:
>> <polars.dataframe.group_by...