Grouping and summarizing
The logic behind slicing and filtering applies here too: we never analyze a dataset row by row, one observation at a time.
We need a better way to look at the data, one that makes it smaller and easier to understand. To do that, we can aggregate data, creating groups of observations and putting each one of them in a separate and labeled box. This is grouping.
After that, we have groups, but n boxes are of little use if all we know about each one is the group name on its label. Summarization fills that gap by taking the observations in each box and wrapping them up in a single number, which could be the mean, the median, or the total. Summarization is, therefore, reducing the observations in a group to one number.
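The two steps described above can be sketched in a few lines of plain Python. This is only an illustration, not the method used elsewhere in this text: the `observations` data and the helper names `group_by_label` and `summarize` are hypothetical, chosen to mirror the "labeled boxes" picture.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical observations: each one is a (group label, value) pair.
observations = [
    ("north", 10.0),
    ("south", 4.0),
    ("north", 20.0),
    ("south", 6.0),
    ("north", 30.0),
]


def group_by_label(rows):
    """Grouping: put each observation's value in the box for its label."""
    boxes = defaultdict(list)
    for label, value in rows:
        boxes[label].append(value)
    return dict(boxes)


def summarize(boxes, func):
    """Summarization: reduce the contents of each box to one number."""
    return {label: func(values) for label, values in boxes.items()}


boxes = group_by_label(observations)

means = summarize(boxes, mean)
medians = summarize(boxes, median)
totals = summarize(boxes, sum)

print(means)   # one mean per group
print(totals)  # one total per group
```

Note that `summarize` only makes sense once `group_by_label` has run: the summary function is applied per box, so the grouping decides which observations each number describes.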
Given these definitions, it is reasonable to say that summarization complements grouping, since we first aggregate the data into groups and then...