Reshaping with pd.DataFrame.explode
The world would be so simple if every piece of data fitted perfectly as a scalar into a two-dimensional pd.DataFrame. Alas, life is not so simple. Especially when working with semi-structured sources of data like JSON, it is not uncommon to have individual items in your pd.DataFrame contain non-scalar sequences like lists and tuples.
You may find it acceptable to leave data in that state, but at other times there is value in normalizing the data by extracting the sequences contained within a column into individual elements.
Figure 7.7: Using pd.DataFrame.explode to extract list elements to individual rows
To that end, pd.DataFrame.explode is the right tool for the job. It may not be a function you use every day, but when you eventually need it, you will be happy to have known about it. Attempting to replicate the same functionality outside of pandas can be error-prone and slow!
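As a quick taste before the recipe proper, here is a minimal sketch of the idea. The column names and values are made up purely for illustration and are not taken from the recipe's data; the only pandas features assumed are the column argument and the ignore_index keyword of pd.DataFrame.explode.

```python
import pandas as pd

# Hypothetical sample data: each cell in "tags" holds a list
df = pd.DataFrame({
    "name": ["alice", "bob", "carol"],
    "tags": [["a", "b"], ["c"], []],
})

# Each list element gets its own row; the other columns are repeated,
# and the original index label is duplicated for rows from the same list.
# An empty list becomes a single row holding a missing value.
exploded = df.explode("tags")
print(exploded)

# ignore_index=True discards the duplicated labels in favor of a fresh
# 0..n-1 index
renumbered = df.explode("tags", ignore_index=True)
print(renumbered)
```

Whether to keep the repeated index labels is a judgment call: they let you trace each exploded row back to its source row, while a fresh index is usually easier to work with in later steps.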
How to do it
Since...