Apache Hive was developed at Facebook primarily to address the data warehousing requirements of the Hadoop platform. It was created so that analysts with strong SQL skills could run queries on a Hadoop cluster for data analytics. Although we often talk about going unstructured and using NoSQL, Apache Hive still has a place in today's big data landscape.
Apache Hive provides an SQL-like query language called HiveQL. Hive queries can run as jobs on MapReduce, Apache Tez, or Apache Spark, which in turn rely on YARN to allocate cluster resources. Much like an RDBMS, Hive has provided indexing support with different index types, such as bitmap, over data stored in HDFS, although indexing was removed in Hive 3.0. Data can be stored in different formats, such as ORC, Parquet, TextFile, SequenceFile, and so on.
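To make the points above concrete, here is a minimal HiveQL sketch that selects an execution engine, creates a table stored as ORC, and runs an aggregate query. The table name page_views and its columns are hypothetical and chosen only for illustration; which engines are actually available depends on your Hive version and cluster setup.

```sql
-- Choose the execution engine for this session.
-- Commonly accepted values are mr, tez, and spark, depending on the Hive version.
SET hive.execution.engine=tez;

-- Hypothetical table, for illustration only; data is stored in ORC format.
CREATE TABLE IF NOT EXISTS page_views (
  user_id   BIGINT,
  url       STRING,
  view_time TIMESTAMP
)
STORED AS ORC;

-- A familiar SQL-style aggregate; Hive compiles it into jobs for the chosen engine.
SELECT url, COUNT(*) AS views
FROM page_views
GROUP BY url
ORDER BY views DESC
LIMIT 10;
```

Replacing ORC with PARQUET, TEXTFILE, or SEQUENCEFILE in the STORED AS clause changes the storage format without changing the query.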
Hive querying also supports extended User Defined Functions...