The following is a representation of Hive architecture:
The preceding diagram shows that the Hive architecture is divided into three parts: clients, services, and the metastore. A Hive SQL query is executed as follows:
- Hive SQL query: A Hive query can be submitted to the Hive server in one of these ways: through the web UI, a JDBC/ODBC application, or the Hive CLI. For a Thrift-based application, Hive provides a Thrift client for communication.
- Query execution: Once the Hive server receives the query, it compiles the query, converts it into an optimized query plan for better performance, and then converts that plan into a MapReduce job. During this process, the Hive server interacts with the metastore to obtain metadata about the tables the query references.
- Job execution: The MapReduce job is executed on the Hadoop cluster.
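
The three steps above can be pictured with a small toy model. This is only an illustrative sketch, not Hive's actual API: every name in it (`Metastore`, `compile_query`, `optimize`, `to_mapreduce_job`, `run_job`, `submit`) is hypothetical, the "SQL" it accepts is limited to `SELECT <cols> FROM <table>` with no spaces inside the column list, and the "cluster" is an in-memory dictionary standing in for Hadoop.

```python
from dataclasses import dataclass, field


@dataclass
class Metastore:
    """Toy stand-in for the Hive metastore: maps table names to column lists."""
    tables: dict = field(default_factory=dict)

    def columns(self, table):
        # Raises KeyError for an unknown table.
        return self.tables[table]


def compile_query(sql, metastore):
    """Parse 'SELECT <cols> FROM <table>' and resolve it against the metastore."""
    tokens = sql.strip().rstrip(";").split()
    if len(tokens) != 4 or tokens[0].upper() != "SELECT" or tokens[2].upper() != "FROM":
        raise ValueError("this sketch only supports 'SELECT <cols> FROM <table>'")
    table = tokens[3]
    known = metastore.columns(table)  # metadata lookup during compilation
    wanted = list(known) if tokens[1] == "*" else tokens[1].split(",")
    missing = [c for c in wanted if c not in known]
    if missing:
        raise ValueError(f"unknown columns: {missing}")
    return {"table": table, "columns": wanted}


def optimize(plan):
    """Placeholder optimization: drop duplicate columns, keeping first-seen order."""
    return {**plan, "columns": list(dict.fromkeys(plan["columns"]))}


def to_mapreduce_job(plan):
    """Turn the plan into a map-only 'job' that projects the requested columns."""
    cols = plan["columns"]

    def mapper(row):
        return {c: row[c] for c in cols}

    return {"input": plan["table"], "mapper": mapper}


def run_job(job, cluster_data):
    """Pretend cluster: apply the mapper to every row of the input table."""
    return [job["mapper"](row) for row in cluster_data[job["input"]]]


def submit(sql, metastore, cluster_data):
    """Client entry point: compile, optimize, convert to a job, then execute."""
    plan = compile_query(sql, metastore)
    job = to_mapreduce_job(optimize(plan))
    return run_job(job, cluster_data)


# Hypothetical sample data for the sketch.
metastore = Metastore({"users": ["id", "name", "city"]})
cluster_data = {
    "users": [
        {"id": 1, "name": "Ann", "city": "Oslo"},
        {"id": 2, "name": "Bo", "city": "Rome"},
    ]
}

result = submit("SELECT id,name FROM users", metastore, cluster_data)
print(result)  # [{'id': 1, 'name': 'Ann'}, {'id': 2, 'name': 'Bo'}]
```

The point of the sketch is the ordering of responsibilities: the metastore is consulted while the query is compiled, and the data itself is touched only once the job runs on the cluster.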