Analyzing two variables using a pivot table
A pivot table summarizes a dataset by grouping and aggregating its variables. Common aggregation functions include sum, count, average, minimum, and maximum. For bivariate analysis, a pivot table suits a categorical-numerical pair of variables: the numerical variable is aggregated for each category of the categorical variable.
The name pivot table has its origin in spreadsheet software. The summary provided by a pivot table can easily uncover meaningful insights from a large dataset.
In this recipe, we will explore how to create a pivot table in pandas. The pivot_table method in pandas can be used for this.
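As a rough illustration of the idea described above, the following sketch aggregates a numerical column for each category of a categorical column. The small DataFrame and its column names (species, body_mass_g) are made up for this example and are not the penguin data used in the recipe; the choice of the mean as the aggregation is also just an assumption.

```python
import pandas as pd

# Hypothetical toy data for illustration only (not the recipe's dataset).
data = pd.DataFrame({
    "species": ["Adelie", "Adelie", "Gentoo", "Gentoo", "Chinstrap"],
    "body_mass_g": [3700, 3800, 5000, 5200, 3750],
})

# Group by the categorical column (index) and aggregate the
# numerical column (values) with the chosen function (aggfunc).
summary = pd.pivot_table(
    data,
    index="species",
    values="body_mass_g",
    aggfunc="mean",
)

print(summary)
```

Swapping aggfunc for "sum", "count", "min", or "max" would give the other summaries mentioned earlier, one row per category.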
Getting ready
We will work with the Palmer Archipelago (Antarctica) penguin data from Kaggle in this recipe. You can retrieve all the files from the GitHub repository.
How to do it…
We will learn how to create a pivot table using the pandas library:
-
...