Replicating pivot_table with a groupby aggregation
At first glance, it may seem that the pivot_table method provides a unique way to analyze data. However, after a little massaging, it is possible to replicate its functionality exactly with a groupby aggregation. Knowing this equivalence can help shrink the universe of pandas functionality.
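As a rough illustration of this equivalence, the sketch below builds a small made-up DataFrame (the `toy` data is hypothetical and only borrows column names from the flights dataset) and compares a pivot table against one common way to express the same result: grouping by both the index and column keys, aggregating, and then unstacking. This is a sketch under those assumptions, not necessarily the exact sequence of steps the recipe follows.

```python
import pandas as pd

# Hypothetical toy data; column names mirror the flights dataset,
# but the values are invented purely for illustration.
toy = pd.DataFrame({
    'AIRLINE': ['AA', 'AA', 'AA', 'DL', 'DL', 'UA'],
    'ORG_AIR': ['ATL', 'ATL', 'DEN', 'ATL', 'DEN', 'DEN'],
    'CANCELLED': [1, 0, 1, 0, 1, 0],
})

# Pivot table: one row per airline, one column per origin airport.
pivoted = toy.pivot_table(index='AIRLINE', columns='ORG_AIR',
                          values='CANCELLED', aggfunc='sum',
                          fill_value=0)

# Groupby version: group on both keys, aggregate, then move the
# ORG_AIR level from the index into the columns.
grouped = (toy.groupby(['AIRLINE', 'ORG_AIR'])['CANCELLED']
              .sum()
              .unstack('ORG_AIR', fill_value=0))

# Raises AssertionError if the two tables differ. The dtype check is
# relaxed because integer/float handling of filled cells may vary
# between pandas versions.
pd.testing.assert_frame_equal(pivoted, grouped, check_dtype=False)
```

The key idea is that the `index` and `columns` arguments of pivot_table together play the role of the grouping keys, while `unstack` handles the reshaping that pivot_table performs implicitly.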
Getting ready
In this recipe, we use the flights dataset to create a pivot table and then recreate it using groupby operations.
How to do it...
- Read in the flights dataset, and use the pivot_table method to find the total number of canceled flights per origin airport for each airline:
>>> import pandas as pd
>>> flights = pd.read_csv('data/flights.csv')
>>> fp = flights.pivot_table(index='AIRLINE', columns='ORG_AIR', values='CANCELLED', aggfunc='sum', fill_value=0).round(2)
>>> fp.head()
- A groupby aggregation cannot directly replicate this...