Replicating pivot_table with a groupby aggregation
At first glance, it may seem that the .pivot_table method provides a unique way to analyze data. However, after a little massaging, it is possible to replicate its functionality with the .groupby method. Knowing this equivalence can help shrink the universe of pandas functionality.
In this recipe, we use the flights dataset to create a pivot table and then recreate it using the .groupby method.
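As a minimal sketch of the idea, the equivalence can be tried on a small made-up DataFrame before turning to the real data. The `toy` frame and its values are hypothetical stand-ins for the flights dataset; only the column names are borrowed from it. The sketch assumes that grouping by both keys, aggregating, and then unstacking one key into the columns is enough to reproduce the pivot table:

```python
import pandas as pd

# Hypothetical toy data standing in for the flights dataset
toy = pd.DataFrame({
    "AIRLINE":   ["AA", "AA", "AA", "DL", "DL", "UA"],
    "ORG_AIR":   ["ATL", "ATL", "DEN", "ATL", "DEN", "DEN"],
    "CANCELLED": [1, 0, 1, 0, 0, 1],
})

# Pivot table: airlines as rows, origin airports as columns,
# missing airline/airport combinations filled with 0
pt = toy.pivot_table(index="AIRLINE", columns="ORG_AIR",
                     values="CANCELLED", aggfunc="sum", fill_value=0)

# groupby version: aggregate over both keys, then move ORG_AIR
# from the row index into the columns
gb = (toy.groupby(["AIRLINE", "ORG_AIR"])["CANCELLED"]
         .sum()
         .unstack("ORG_AIR", fill_value=0))

# Raises AssertionError if labels or values differ (dtype is ignored here)
pd.testing.assert_frame_equal(pt, gb, check_dtype=False)
```

The `fill_value=0` argument matters in both versions: the toy data has no UA flights out of ATL, so that cell would otherwise be missing.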
How to do it…
- Read in the flights dataset, and use the .pivot_table method to find the total number of canceled flights per origin airport for each airline:

    >>> import pandas as pd
    >>> flights = pd.read_csv('data/flights.csv')
    >>> fpt = flights.pivot_table(index='AIRLINE',
    ...                           columns='ORG_AIR',
    ...                           values='CANCELLED',
    ...                           aggfunc='sum',
    ...                           fill_value=0)
    >>> fpt
    ORG_AIR  ATL  DEN  DFW  IAH  LAS  LAX  MSP  ORD  PHX  SFO
    AIRLINE
    AA ...