Reshaping the DataFrame
A DataFrame that combines categorical and numeric columns can be expressed in either wide or long format. For example, the students DataFrame is in long format, since all countries are stored in a single country column. Depending on the purpose of the analysis, we may instead want a separate column for each unique country in the dataset, which adds more columns to the DataFrame and converts it into wide format.
Converting between wide and long formats can be achieved via the spread() and gather() functions, both of which are provided by the tidyr package from the tidyverse ecosystem. Let’s see how they work in practice.
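As a minimal sketch of the round trip between the two formats, consider the small table below. It is a hypothetical enrollment table invented for illustration, not the students DataFrame, and the names enroll_long, enroll_wide, and enroll_back are assumptions. Only the spread() and gather() calls come from tidyr.

```r
library(tidyr)

# Hypothetical long-format data: one row per (year, country) pair
enroll_long <- data.frame(
  year     = c(2020, 2020, 2021, 2021),
  country  = c("SG", "US", "SG", "US"),
  students = c(10, 25, 12, 30),
  stringsAsFactors = FALSE
)

# Long -> wide: each unique value of `country` becomes its own column,
# filled with the matching values from `students`
enroll_wide <- spread(enroll_long, key = country, value = students)

# Wide -> long: stack the SG and US columns back into key/value pairs
enroll_back <- gather(enroll_wide, key = "country", value = "students", SG, US)

print(enroll_wide)
print(enroll_back)
```

Here the wide table has one row per year and one column per country, while the long table has one row per year and country combination. Recent tidyr releases mark spread() and gather() as superseded by pivot_wider() and pivot_longer(), but both functions remain available.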
Converting from long format into wide format using spread()
There will be times when we’ll want to turn a long-formatted DataFrame into a wide format. The spread() function can be used to convert a categorical column with multiple categories into multiple columns, as specified by the key...