Chapter 7: Data Aggregation and Group Operations
Data aggregation and group operations are core techniques in data analysis. They split data into groups based on a specified key, apply a set of groupby operations (aggregations or transformations) to each group to produce new values, and then combine those values into a single result.
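The three steps can be sketched in plain Python before turning to any library. This is a minimal illustration, not the chapter's own code; the `records` data and the variable names are hypothetical and chosen only for the example.

```python
from collections import defaultdict

# Hypothetical sample data: (key, value) pairs.
records = [("a", 1), ("b", 2), ("a", 3), ("b", 4), ("c", 5)]

# Split: collect the values that share a key into one group.
groups = defaultdict(list)
for key, value in records:
    groups[key].append(value)

# Apply: run an aggregation (here, a sum) on each group separately.
sums = {key: sum(values) for key, values in groups.items()}

# Combine: gather the per-group results into a single sorted list.
result = sorted(sums.items())
print(result)
```

Each key ends up with exactly one aggregated value, which is the shape a groupby aggregation produces.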
This approach is popularly known as split-apply-combine. The term was coined by Hadley Wickham, the author of many popular R packages, to describe group operations. Figure 7.1 illustrates the idea of split-apply-combine:
In this chapter, we look at ways of performing group operations: how to group data by column keys and how to aggregate the grouped data jointly or independently.
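As a rough preview, grouping by a column key and then aggregating might look like the sketch below. It assumes the pandas library, which this introduction does not name, and the DataFrame contents are hypothetical. It also takes "jointly" to mean one aggregation applied to every column and "independently" to mean a separate aggregation per column, which is one reading of the sentence above.

```python
import pandas as pd

# Hypothetical sample data; the column names are illustrative only.
df = pd.DataFrame({
    "key": ["a", "b", "a", "b", "c"],
    "x": [1, 2, 3, 4, 5],
    "y": [10.0, 20.0, 30.0, 40.0, 50.0],
})

# Group the rows by the values in the "key" column.
grouped = df.groupby("key")

# Jointly: apply the same aggregation to every remaining column.
joint = grouped.sum()

# Independently: choose a different aggregation for each column.
independent = grouped.agg({"x": "sum", "y": "mean"})

print(joint)
print(independent)
```

Both results are indexed by the group key, with one row per distinct value of `key`.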
This chapter will also show how to access grouped data by keys. It also gives insight into...