Grouping and calculating multiple columns
Now that we have the basics down, let’s take a look at a pd.DataFrame that contains more columns of data. Generally, your pd.DataFrame objects will contain many columns with potentially different data types, so knowing how to select and work with them all through the context of pd.core.groupby.DataFrameGroupBy is important.
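As a minimal sketch of the idea, assuming a small made-up frame (the names example, group, x, y, and label are hypothetical and not part of this recipe), grouping a pd.DataFrame yields a DataFrameGroupBy object, from which a subset of columns can be selected before aggregating:

```python
import pandas as pd

# Hypothetical data, invented purely for illustration
example = pd.DataFrame({
    "group": ["a", "a", "b"],
    "x": [1, 2, 3],
    "y": [0.5, 1.5, 2.5],
    "label": ["p", "q", "r"],
})

# Grouping a DataFrame produces a DataFrameGroupBy object
grouped = example.groupby("group")
print(type(grouped).__name__)

# Select only the numeric columns, then aggregate each group
totals = grouped[["x", "y"]].sum()
print(totals)
```

Selecting with a list of column names keeps the result as a pd.DataFrame and leaves the non-numeric label column out of the aggregation.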
How to do it
Let’s create a pd.DataFrame that shows the sales and returns of a hypothetical widget across different region and month values:
df = pd.DataFrame([
["North", "Widget A", "Jan", 10, 2],
["North", "Widget B", "Jan", 4, 0],
["South", "Widget A", "Jan", 8, 3],
["South", "Widget B", "Jan", 12, 8],
["North", "Widget A", "Feb", 3, 0],
["North", "Widget B", "Feb", 7, 0],
["South"...