Creating a crosstab/two-way table on bivariate data
A crosstab displays the relationship between categorical variables in a matrix format. The rows of the crosstab are typically the categories within the first categorical variable, while the columns are the categories of the second categorical variable. The values within the crosstab are either the frequency of occurrence or the percentage of occurrence. It is also known as a two-way table or contingency table. With the crosstab, we can easily uncover trends and patterns, especially as they relate to specific categories within our dataset.
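To make the cell values concrete, here is a minimal sketch that builds a two-way table by hand, first as frequencies and then as percentages of the grand total. The DataFrame is toy data invented for illustration, not the penguin dataset, and the column names `species` and `island` are assumptions.

```python
import pandas as pd

# Toy data invented for illustration; not the actual penguin dataset.
df = pd.DataFrame({
    "species": ["Adelie", "Adelie", "Gentoo", "Gentoo", "Gentoo", "Chinstrap"],
    "island": ["Torgersen", "Biscoe", "Biscoe", "Biscoe", "Biscoe", "Dream"],
})

# Count each (species, island) pair, then pivot island into the columns.
# Pairs that never occur are filled with 0.
counts = df.groupby(["species", "island"]).size().unstack(fill_value=0)

# The same table expressed as a percentage of all observations.
percentages = counts / counts.to_numpy().sum() * 100

print(counts)
print(percentages.round(1))
```

Rows hold the categories of the first variable and columns hold the categories of the second, matching the layout described above.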
In this recipe, we will explore how to create crosstabs in pandas. The crosstab function in pandas can be used for this.
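As a rough sketch of how this looks in practice, the snippet below calls `pd.crosstab` twice: once for raw frequencies and once with `normalize="index"` to get row-wise percentages. The data is a small made-up sample, and the column names `species` and `island` are assumptions rather than a confirmed view of the recipe's files.

```python
import pandas as pd

# Small made-up sample; the column names are assumed for illustration.
df = pd.DataFrame({
    "species": ["Adelie", "Adelie", "Gentoo", "Gentoo", "Gentoo", "Chinstrap"],
    "island": ["Torgersen", "Biscoe", "Biscoe", "Biscoe", "Biscoe", "Dream"],
})

# Frequency table: the first argument supplies the rows, the second the columns.
freq = pd.crosstab(df["species"], df["island"])

# Row-wise proportions, scaled to percentages so each row sums to 100.
row_pct = pd.crosstab(df["species"], df["island"], normalize="index") * 100

print(freq)
print(row_pct.round(1))
```

Passing `normalize="columns"` or `normalize="all"` instead would scale by column totals or by the grand total, respectively.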
Getting ready
We will work with the Palmer Archipelago (Antarctica) penguin data from Kaggle in this recipe. You can retrieve all the files from the GitHub repository.
How to do it…
We will learn how to create a crosstab using the pandas library:
- Import...