Using Hive
With Hive installed, we will now import and analyze the UFO data set introduced in Chapter 4, Developing MapReduce Programs.
When importing any new data into Hive, there is generally a three-stage process:
1. Create the specification of the table into which the data is to be imported.
2. Import the data into the created table.
3. Execute HiveQL queries against the table.
This process should look very familiar to anyone with relational database experience. Hive gives a structured query view of our data; to enable that, we must first define the table's columns and then import the data into the table before we can execute any queries.
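To make the relational analogy concrete, the three stages can be sketched with Python's built-in sqlite3 module standing in for Hive. This is only an illustration of the workflow, not Hive itself: the table name ufodata, its two columns, and the sample rows are hypothetical and not taken from the real data set. The comments mention the HiveQL statements that would typically play each role.

```python
import sqlite3

# Hypothetical sample rows standing in for the UFO data: (sighted, shape).
SAMPLE_ROWS = [
    ("19951009", "disk"),
    ("19951010", "light"),
    ("19950101", "disk"),
]


def three_stage_demo(rows):
    """Run the create / import / query cycle against an in-memory database."""
    conn = sqlite3.connect(":memory:")
    try:
        # Stage 1: create the table specification.
        # In Hive this would be a CREATE TABLE statement, usually with a
        # ROW FORMAT clause describing how the source file is delimited.
        conn.execute("CREATE TABLE ufodata (sighted TEXT, shape TEXT)")

        # Stage 2: import the data into the created table.
        # In Hive this role is typically played by a LOAD DATA statement
        # pointing at a file, rather than row-by-row inserts.
        conn.executemany("INSERT INTO ufodata VALUES (?, ?)", rows)

        # Stage 3: execute queries against the table.
        # A GROUP BY aggregate like this one is also valid HiveQL.
        cursor = conn.execute(
            "SELECT shape, COUNT(*) FROM ufodata GROUP BY shape ORDER BY shape"
        )
        return cursor.fetchall()
    finally:
        conn.close()


print(three_stage_demo(SAMPLE_ROWS))  # prints [('disk', 2), ('light', 1)]
```

Querying before the table exists would fail, which is why the table must be defined first in both SQLite and Hive.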
Note
We assume a general level of familiarity with SQL and will focus more on how to get things done with Hive than on explaining particular SQL constructs in detail. A SQL reference may be handy for those with little familiarity with the language, though we will make sure you know what each statement does, even if the details require...