Joining DataFrames
A join operation merges rows from two or more datasets by matching values in a shared column that relates them. You may already be familiar with the concept, since joins are common in data processing tools such as SQL and in other DataFrame libraries such as pandas and Spark.
In this recipe, we’ll look at how to apply join operations in Polars DataFrames.
Getting ready
We’ll continue to use the same data we used in previous recipes in this chapter. Run the following code to repeat the same preparation steps and rename the DataFrame accordingly:
import polars as pl
from polars import selectors as cs

academic_df = (
    pl.read_csv('../data/academic.csv')
    .select(
        pl.col('year').alias('academic_year'),
        cs.numeric().cast(pl.Int64)
        ...