Loading data into Hive
In this recipe, we look at how to import data into Hive and also how to point Hive at existing data using an external table.
Hive supports several storage formats, including plain text, ORC, and Parquet. Each has its own trade-offs in terms of compression, query performance, space utilization, and memory overhead.
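The storage format is chosen when a table is declared. A minimal sketch of the same table defined in each of the formats mentioned above (the table and column names here are illustrative, not part of the recipe):

```sql
-- Plain text, tab-delimited (the Hive default family of formats)
CREATE TABLE logs_text (id INT, msg STRING)
  ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
  STORED AS TEXTFILE;

-- ORC: columnar, compressed, with built-in indexes
CREATE TABLE logs_orc (id INT, msg STRING)
  STORED AS ORC;

-- Parquet: columnar format shared with the wider Hadoop ecosystem
CREATE TABLE logs_parquet (id INT, msg STRING)
  STORED AS PARQUET;
```

Text tables are easy to inspect and load, while ORC and Parquet generally give better compression and scan performance for analytical queries.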
Getting ready
To progress through this recipe, you must have completed the recipe Using MySQL for Hive metastore. The Hive distribution ships with sample data files for the examples under $HIVE_HOME/examples.
How to do it...
Connect to the edge node edge1.cyrus.com in the cluster and switch to the hadoop user.
Connect using either the Hive shell or the Beeline client and import the data by creating a table, as shown in the following screenshot:
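The statements in the screenshot follow the classic Hive getting-started example. A sketch of the two statements, assuming the stock kv1.txt sample file shipped under $HIVE_HOME/examples/files and default settings:

```sql
-- Create a simple managed table with two columns
CREATE TABLE pokes (foo INT, bar STRING);

-- Load the sample file from the local filesystem; LOCAL copies the
-- file into the table's directory under the Hive warehouse.
-- ${env:HIVE_HOME} relies on Hive's environment-variable substitution;
-- you can also spell out the absolute path for your install.
LOAD DATA LOCAL INPATH '${env:HIVE_HOME}/examples/files/kv1.txt'
OVERWRITE INTO TABLE pokes;
```

Because pokes is a managed table, LOAD DATA places a copy of kv1.txt in the table's warehouse directory, which is what the next step inspects.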
Now take a look at the HDFS warehouse location. You will see a file named kv1.txt copied there, as shown in the following screenshot:
Describe the table pokes and look at the data, as shown in the following screenshot. What if you...
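The checks above can be sketched from the Hive or Beeline prompt as follows (the warehouse path assumes the default hive.metastore.warehouse.dir of /user/hive/warehouse; adjust if your hive-site.xml overrides it):

```sql
-- List the table's directory in HDFS; kv1.txt should appear here
dfs -ls /user/hive/warehouse/pokes;

-- Show the table's columns and types
DESCRIBE pokes;

-- Sample a few rows to confirm the data loaded
SELECT * FROM pokes LIMIT 5;
```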