Data aggregation
As a final topic, we will look at ways to get a condensed view of data with aggregations. Pandas comes with many built-in aggregation functions. We already saw the describe function in Chapter 3, Data Analysis with Pandas. It also works on parts of the data. We start again with some artificial data, containing measurements of the number of sunshine hours per city and date:
>>> df.head()
   country     city        date  hours
0  Germany  Hamburg  2015-06-01      8
1  Germany  Hamburg  2015-06-02     10
2  Germany  Hamburg  2015-06-03      9
3  Germany  Hamburg  2015-06-04      7
4  Germany  Hamburg  2015-06-05      3
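The text does not show how this data set is built, so the following is only a minimal sketch of one way to create a frame with the same columns. The five Hamburg values are copied from the output above. The Berlin rows, the date range, and the variable names hamburg and summary are assumptions made up for illustration. The last lines call describe on a single city's rows, as one example of summarizing part of the data.

```python
import pandas as pd

# Five consecutive days, matching the dates shown in the head() output.
dates = pd.date_range("2015-06-01", periods=5, freq="D")

# Hamburg hours are taken from the output above; the Berlin hours are
# invented placeholders and are NOT the book's data.
df = pd.DataFrame({
    "country": "Germany",
    "city": ["Hamburg"] * 5 + ["Berlin"] * 5,
    "date": list(dates) * 2,
    "hours": [8, 10, 9, 7, 3, 6, 2, 0, 4, 9],
})

print(df.head())

# describe also works on a part of the data, here the rows of one city.
hamburg = df[df["city"] == "Hamburg"]
summary = hamburg["hours"].describe()
print(summary)
```

Selecting rows with a boolean mask and then calling describe gives the summary for one city only. It does not scale well to many cities, which is where grouping becomes useful.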
To view a summary per city, we use the describe function on the grouped data set:
>>> df.groupby("city").describe()
                  hours
city
Berlin count  10.000000
       mean    6.000000
       std     3.741657
       min     0.000000
       25%     4.000000
       50%     6.000000
       75...