Data aggregation is the process of grouping data based on meaningful categories in the information. Analysis is then performed on each group to report one or more summary statistics for it. Summarization is a general term here: it can literally be a summation (such as the total number of units sold) or a statistical calculation such as a mean or standard deviation.
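As a minimal sketch of this idea, the following groups a small table by a category and reports a sum, a mean, and a standard deviation for each group. The `sales` data and its `region` and `units` columns are hypothetical, invented only for illustration.

```python
import pandas as pd

# Hypothetical sales records, invented purely for illustration.
sales = pd.DataFrame({
    "region": ["East", "East", "West", "West", "West"],
    "units": [10, 20, 5, 15, 10],
})

# Group the rows by region, then compute several summary
# statistics of the units column for each group.
summary = sales.groupby("region")["units"].agg(["sum", "mean", "std"])
print(summary)
```

The result has one row per region and one column per statistic, so the same grouping yields both a literal summation and statistical measures.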
This chapter will examine the facilities pandas provides for data aggregation. These include a powerful split-apply-combine pattern for grouping data, performing group-level transformations and analyses, and reporting the results from every group in a summary pandas object. Within this framework, we will examine several techniques for grouping data, applying functions at the group level, and filtering data in or out of the analysis.
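The three operations named above can be sketched briefly as follows. This is an illustrative example under assumed data: the `df` frame, its `sensor` and `reading` columns, and the two-row threshold are all hypothetical and not taken from the chapter.

```python
import pandas as pd

# Hypothetical readings, invented purely for illustration.
df = pd.DataFrame({
    "sensor": ["a", "a", "b", "b", "b", "c"],
    "reading": [1.0, 3.0, 10.0, 20.0, 30.0, 7.0],
})

# Split: partition the readings by sensor.
grouped = df.groupby("sensor")["reading"]

# Apply and combine: one summary value per group.
means = grouped.mean()

# Group-level transformation: the result keeps the shape of the
# input, with each reading centered on its own group's mean.
centered = grouped.transform(lambda s: s - s.mean())

# Filtering: keep only the rows of groups with at least two readings
# (an arbitrary threshold chosen for this sketch).
kept = df.groupby("sensor").filter(lambda g: len(g) >= 2)

print(means)
print(centered)
print(kept)
```

Aggregation returns one row per group, transformation returns one row per original row, and filtering returns a subset of the original rows; these are the three shapes of result that the split-apply-combine pattern can produce.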
Specifically, in this...