Group By
One of the most fundamental tasks in data analysis is splitting data into independent groups and then performing a calculation on each group. The approach itself is long established, but the name split-apply-combine is more recent.
Within the apply step of the split-apply-combine paradigm, it also helps to know whether we are performing a reduction (also referred to as an aggregation) or a transformation. A reduction collapses the values in a group down to one value, whereas a transformation attempts to maintain the shape of the group.
To illustrate, here is what split-apply-combine looks like for a reduction:
![](https://static.packt-cdn.com/products/9781836205876/graphics/Images/B31091_08_01.png)
Figure 8.1: Split-apply-combine paradigm for a reduction
Here is the same paradigm for a transformation:
![](https://static.packt-cdn.com/products/9781836205876/graphics/Images/B31091_08_02.png)
Figure 8.2: Split-apply-combine paradigm for a transformation
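To make the two figures concrete, the following is a minimal sketch that performs each step by hand on a small, made-up pd.DataFrame. The column names group and value and the data itself are assumptions chosen for illustration, and the manual loop is only meant to expose the three steps, not to suggest how grouping should be done in practice.

```python
import pandas as pd

# Hypothetical toy data: two groups, "a" and "b".
df = pd.DataFrame({
    "group": ["a", "b", "a", "b"],
    "value": [1, 2, 3, 4],
})

# Split: one sub-Series of values per distinct group label.
pieces = {
    label: df.loc[df["group"] == label, "value"]
    for label in df["group"].unique()
}

# Apply + combine (reduction): each group collapses to a single value.
reduced = pd.Series({label: piece.sum() for label, piece in pieces.items()})

# Apply + combine (transformation): each group keeps its original shape,
# here by subtracting the group mean from every value in the group.
transformed = pd.concat(
    [piece - piece.mean() for piece in pieces.values()]
).sort_index()

print(reduced)      # one entry per group
print(transformed)  # one entry per original row
```

The reduction yields one entry per group label, while the transformation yields one entry per row of the original data, aligned to the original index.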
In pandas, the pd.DataFrame.groupby method is responsible for splitting, applying a function of your choice, and combining...