Turning columns into rows
Unpivoting is a common operation when reshaping or tidying your data. It turns columns into rows, which makes the data narrower and longer. In other words, the unpivot operation converts your data from a wide format to a long format. This operation is also known as melt.
Both wide format and long format have their use cases. Some client tools prefer wide format, while others prefer long format.
In this recipe, we’ll cover how to turn columns into rows using the unpivot operation.
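To make the wide-to-long idea concrete, here is a minimal plain-Python sketch of what unpivoting does to a table. The data and the helper name unpivot_rows are hypothetical and only illustrate the reshaping; this is not how Polars implements it.

```python
# Hypothetical wide-format data: one row per year, one column per student type.
wide_rows = [
    {"academic_year": "2020/21", "undergraduate": 100, "graduate": 80},
    {"academic_year": "2021/22", "undergraduate": 110, "graduate": 90},
]


def unpivot_rows(rows, index, on, variable_name="variable", value_name="value"):
    """Turn each column listed in `on` into rows, keeping `index` as the identifier."""
    long_rows = []
    for column in on:  # one block of output rows per unpivoted column
        for row in rows:
            long_rows.append(
                {index: row[index], variable_name: column, value_name: row[column]}
            )
    return long_rows


long_rows = unpivot_rows(
    wide_rows, index="academic_year", on=["undergraduate", "graduate"]
)
for row in long_rows:
    print(row)
```

Two wide rows with two value columns become four long rows, each holding one identifier, one column name, and one value.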
How to do it...
Here’s how to turn columns into rows:
- Use the .unpivot() method to turn columns into rows:
    long_df = df.unpivot(
        index='academic_year',
        on=[
            'students',
            'us_students',
            'undergraduate',
            'graduate',
            'non_degree',
            'opt'
        ],
        variable_name='student_type',
        value_name='count'
    )
    long_df.head()
The preceding code will return the following output:
Figure 8.2 – The DataFrame in a long format
- Let’s check the student_type column to make sure it contains the values we expect:
    long_df.select('student_type').unique()
The preceding code will return the following output:
Figure 8.3 – Unique values in the student_type column
- Use Polars selectors to pick all the numeric columns at once for the same unpivot operation (here, cs is assumed to be the polars.selectors module, imported as cs). This time, we won’t change the variable and value names:
    df.unpivot(
        index='academic_year',
        on=cs.numeric()
    ).head()
The preceding code will return the following output:
Figure 8.4 – The DataFrame in a long format without renaming new columns
- Apply the unpivot operation to a LazyFrame.
Let’s say you had a LazyFrame instead of a DataFrame:
    lf = df.lazy()
You can still apply the unpivot operation to a LazyFrame:
    (
        lf
        .unpivot(
            index='academic_year',
            on=cs.numeric(),
            variable_name='student_type',
            value_name='count'
        )
        .collect()
        .head()
    )
The preceding code will return the same output as Figure 8.2, because the new columns are again named student_type and count.
How it works...
Using the .unpivot() method is straightforward. You specify the columns to keep as identifiers with the index parameter and the columns that get turned into rows with the on parameter.
In step 3, we didn’t specify the variable and value names. In that case, Polars automatically names those columns variable and value, respectively.
Also, many DataFrame methods are available on a LazyFrame as well, and the .unpivot() method is one of them, so the same reshaping can be part of a lazy query.
See also
Feel free to refer to these resources to learn more about turning columns into rows: