Selection with a MultiIndex – A single level
A pd.MultiIndex is a subclass of a pd.Index that supports hierarchical labels. Depending on who you ask, this can be one of the best or one of the worst features of pandas. After reading this cookbook, I hope you consider it one of the best.
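To make the subclass relationship concrete, here is a minimal sketch. The level names letter and number and the tuples themselves are made up purely for illustration and are not part of the recipe's data:

```python
import pandas as pd

# Hypothetical two-level index, used only to inspect the object itself
index = pd.MultiIndex.from_tuples(
    [("a", 1), ("a", 2), ("b", 1)],
    names=["letter", "number"],  # illustrative level names
)

# A pd.MultiIndex passes anywhere a pd.Index is expected
is_index = isinstance(index, pd.Index)

# Each position in the tuples becomes one level of the hierarchy
n_levels = index.nlevels
level_names = list(index.names)

print(is_index, n_levels, level_names)
```

Each tuple holds one label per level, which is what "hierarchical" means here.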
Much of the derision toward the pd.MultiIndex comes from the fact that the syntax used to select from it can easily become ambiguous, especially when using pd.DataFrame[]. The examples below exclusively use the pd.DataFrame.loc indexer and avoid pd.DataFrame[] to mitigate confusion.
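As a rough illustration of that ambiguity, the sketch below applies the same label to both pd.DataFrame.loc and pd.DataFrame[]. The frame, its value column, and the level names letter and number are assumptions invented for this example:

```python
import pandas as pd

# Hypothetical data: a two-level row index and a single column
index = pd.MultiIndex.from_tuples(
    [("a", 1), ("a", 2), ("b", 1)],
    names=["letter", "number"],
)
df = pd.DataFrame({"value": [10, 20, 30]}, index=index)

# .loc looks the label up in the row index, matching the first level
rows = df.loc["a"]

# [] with a single label looks for a column instead, so the same
# label fails here because no column is named "a"
try:
    df["a"]
    bracket_found = True
except KeyError:
    bracket_found = False

print(rows)
print(bracket_found)
```

With pd.DataFrame.loc, the label is always resolved against the row index first, so the reader does not have to guess which axis is being searched.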
How to do it
pd.MultiIndex.from_tuples can be used to construct a pd.MultiIndex from a list of tuples. In the following example, we create a pd.MultiIndex with two levels – first_name and last_name, sequentially. We will pair this alongside a very simple pd.Series:
index = pd.MultiIndex.from_tuples([
("John", "Smith"),
("John", "Doe"),
("...