Learning about AQE
We already know how Spark works under the hood. Whenever we define transformations, Spark only prepares a plan; as soon as an action is called, it executes those transformations. Now, it's time to expand that knowledge. Let's dive deeper into Spark's query execution mechanism.
Every time Spark executes a query, the query passes through the following four plans:
- Parsed Logical Plan: Spark parses the query and checks its syntax. At this stage, the plan is still unresolved: the table and column names it mentions have not yet been verified.
- Analyzed Logical Plan: Spark's analyzer takes the Parsed Logical Plan and resolves it against the catalog, checking the metadata (table name, column names, and more) to confirm whether the respective entities exist. The result is called the Analyzed Logical Plan. This is then sent to Spark's Catalyst optimizer, which is an advanced query optimizer for Spark.
- Optimized Logical Plan: The Catalyst optimizer applies rule-based optimizations, such as predicate pushdown and column pruning, and comes up with the final logical plan, called the Optimized Logical Plan.
- Physical...
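
To see these plans for a real query, one option is to ask Spark to print them with `DataFrame.explain(mode="extended")` (the `mode` argument is available from Spark 3.0). The following is a minimal sketch, not a definitive recipe: the helper names `split_plan_sections`, `capture_explain`, and `demo`, as well as the toy query inside `demo`, are assumptions made for illustration. It assumes that `explain()` prints to standard output and marks each plan with a `== ... ==` header line.

```python
import contextlib
import io

# Section headers printed by DataFrame.explain(mode="extended").
PLAN_HEADERS = (
    "== Parsed Logical Plan ==",
    "== Analyzed Logical Plan ==",
    "== Optimized Logical Plan ==",
    "== Physical Plan ==",
)


def split_plan_sections(explain_output):
    """Split extended explain() text into a dict of {plan name: plan body}."""
    sections = {}
    current = None
    for line in explain_output.splitlines():
        stripped = line.strip()
        if stripped in PLAN_HEADERS:
            # "== Parsed Logical Plan ==" becomes "Parsed Logical Plan".
            current = stripped.strip("= ")
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return {name: "\n".join(body).strip() for name, body in sections.items()}


def capture_explain(df):
    """Return the text that df.explain(mode="extended") prints to stdout."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        df.explain(mode="extended")
    return buffer.getvalue()


def demo():
    """Print each plan for a toy query (hypothetical example, needs PySpark)."""
    # Imported here so the helpers above can be used without PySpark installed.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.master("local[1]").appName("plans-demo").getOrCreate()
    )
    # Toy query: keep the even ids from a generated range and scale them.
    df = spark.range(100).filter("id % 2 = 0").selectExpr("id * 10 AS value")
    for name, body in split_plan_sections(capture_explain(df)).items():
        print(f"--- {name} ---\n{body}\n")
    spark.stop()
```

Calling `demo()` in an environment with PySpark installed prints the four plans one after another, which makes it easier to compare how the query changes from one stage to the next. In a notebook, calling `df.explain(mode="extended")` directly on any DataFrame gives the same information without the helpers.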