Storing and processing Hive data in the ORC file format
I'm sure that most of the time you have created Hive tables that store data in a text format; in this recipe, we are going to store data in ORC files.
Getting ready
To perform this recipe, you should have a running Hadoop cluster as well as the latest version of Hive installed on it. Here, I am going to use Hive 1.2.1.
How to do it...
Hive 1.2.1 supports several file formats that help it process data faster. In this recipe, we are going to use ORC files to store data in Hive. Before we can store data in ORC files, we need a Hive table that holds the same data in a text format, because the ORC table is populated from it. We will use the same table that we created in the first recipe.
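For readers who do not have the table from the first recipe at hand, the following is a minimal sketch of what such a text table might look like. The table name employee is an assumption, and the columns and the '|' delimiter simply mirror the ORC table created later in this recipe; adjust them to match your own table.

```sql
-- Hypothetical text-format source table; the name "employee" is assumed,
-- and the columns mirror the employee_orc table used in this recipe.
CREATE TABLE IF NOT EXISTS employee (
  id   INT,
  name STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|'
STORED AS TEXTFILE;
```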
Creating a table that stores its data in the ORC format is very easy, as shown here:
create table employee_orc( id int, name string) row format delimited fields terminated by '|' stored as ORC;
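Once the table exists, it can be worth confirming that Hive really registered it as ORC. The sketch below does that with DESCRIBE FORMATTED, and also shows a variant of the table with an explicit compression codec. The table name employee_orc_snappy is hypothetical and used only for illustration. As far as I know, the field delimiter clause has no effect on how ORC stores rows, since ORC is a binary columnar format, so the variant leaves it out.

```sql
-- Inspect the storage details of the table created above; the output
-- should list the ORC SerDe and the ORC input/output formats.
DESCRIBE FORMATTED employee_orc;

-- Hypothetical variant: the same columns, no delimiter clause, and an
-- explicit codec (orc.compress accepts NONE, ZLIB, or SNAPPY).
CREATE TABLE IF NOT EXISTS employee_orc_snappy (
  id   INT,
  name STRING
)
STORED AS ORC
TBLPROPERTIES ("orc.compress" = "SNAPPY");
```

If no codec is given, Hive falls back to its default ORC compression, so the extra table property is only needed when you want to choose the codec yourself.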
To insert data into the table from our text table, we need to execute the following query, which would start...