Performance utilities
Hive provides the EXPLAIN and ANALYZE statements, which can be used as utilities to inspect and diagnose the performance of queries.
The EXPLAIN statement
Hive provides an EXPLAIN command that returns a query execution plan without running the query. We can use EXPLAIN on any query whose performance is in doubt. It also helps to compare the plans of two or more queries written for the same purpose. The syntax for EXPLAIN is as follows:
EXPLAIN [EXTENDED|DEPENDENCY|AUTHORIZATION] hive_query
The following keywords can be used:
EXTENDED: This provides additional information for the operators in the plan, such as file pathname and abstract syntax tree.
DEPENDENCY: This provides a JSON format output that contains a list of tables and partitions that the query depends on. It is available since Hive 0.10.0.
AUTHORIZATION: This lists all entities needed to be authorized including input and output to run the Hive query and authorization failures...
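As a minimal sketch of the syntax and keywords above, the following statements request different plan views of the same query. The employee table and its name and dept columns are hypothetical and used only for illustration; none of these statements runs the underlying query.

```sql
-- Assumed (hypothetical) table: employee(name STRING, dept STRING)

-- Basic execution plan: stage dependencies and the operators in each stage
EXPLAIN
SELECT dept, count(*) AS headcount
FROM employee
GROUP BY dept;

-- Same plan with additional operator details, such as file paths
EXPLAIN EXTENDED
SELECT dept, count(*) AS headcount
FROM employee
GROUP BY dept;

-- JSON output listing the tables and partitions the query depends on
EXPLAIN DEPENDENCY
SELECT dept, count(*) AS headcount
FROM employee
GROUP BY dept;

-- Entities that must be authorized to run the query
EXPLAIN AUTHORIZATION
SELECT dept, count(*) AS headcount
FROM employee
GROUP BY dept;
```

To compare two candidate queries written for the same purpose, run EXPLAIN on each and look at how the number of stages and the operators in each stage differ.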