Performing operations on grouped data in a DataFrame
One of the great features of pandas DataFrames is the ability to group the data by the values in particular columns. For example, we might group assembly line data by the line ID and the shift ID. Being able to operate on this grouped data ergonomically is important because data is often aggregated for analysis but needs to be grouped for preprocessing.
In this recipe, we will learn how to perform operations on grouped data in a DataFrame. We’ll also take the opportunity to show how to operate on rolling windows of (grouped) data.
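As a rough preview of the idea, the following minimal sketch groups some made-up assembly line data by line ID and shift ID, then aggregates, transforms, and applies a rolling window within each group. The column names (line_id, shift_id, output), the group sizes, and the window length of 3 are assumptions chosen purely for illustration; they are not the data used in the recipe itself.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(12345)

# Hypothetical assembly line data: 2 lines x 2 shifts x 6 readings each
df = pd.DataFrame({
    "line_id": np.repeat(["A", "B"], 12),
    "shift_id": np.tile(np.repeat([1, 2], 6), 2),
    "output": rng.normal(100.0, 5.0, size=24),
})

# Group rows by the values in the two ID columns
grouped = df.groupby(["line_id", "shift_id"])["output"]

# Aggregation: one mean value per (line, shift) group
group_means = grouped.mean()

# Transformation: subtract each group's own mean from its readings;
# the result has the same index as df, so it can be stored as a column
centred = df["output"] - grouped.transform("mean")

# Rolling window within each group: mean of the current and two previous
# readings; the first two rows of every group are NaN
rolling_mean = grouped.transform(lambda s: s.rolling(3).mean())

df["centred"] = centred
df["rolling_mean"] = rolling_mean
print(group_means)
```

Using transform rather than an aggregation keeps the result aligned with the original rows, which is what makes it convenient for preprocessing steps such as centring each group before analysis.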
Getting ready
For this recipe, we will need the NumPy library imported as np, the Matplotlib pyplot interface imported as plt, and the pandas library imported as pd. We’ll also need an instance of the default random number generator created as follows:
rng = np.random.default_rng(12345)
Before we start, we also need to set up the Matplotlib plotting settings to change the...