Time for action – creating a table from an existing file
So far we have loaded data into Hive directly from files over which Hive effectively takes control. It is also possible, however, to create tables that model data held in files external to Hive. This is useful when we want to run Hive queries over data that is written and managed by external applications, or that must otherwise remain in directories outside the Hive warehouse directory. Such files are not moved into the Hive warehouse directory when the table is created, nor are they deleted when the table is dropped.
Save the following to a file called states.hql:

CREATE EXTERNAL TABLE states (abbreviation string, full_name string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION '/tmp/states' ;
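The table definition maps each line of the underlying file onto two string columns split on a tab character. As an illustration of the file layout the table expects, here is a small Python sketch; the sample rows are hypothetical stand-ins for the states.txt data file used below:

```python
# Sketch: write a few tab-delimited rows in the layout the
# 'states' table expects, then read them back.
rows = [
    ("AL", "Alabama"),
    ("AK", "Alaska"),
    ("AZ", "Arizona"),
]

# Each line is: abbreviation <TAB> full_name, matching
# FIELDS TERMINATED BY '\t' in the table definition.
with open("states.txt", "w") as f:
    for abbreviation, full_name in rows:
        f.write(f"{abbreviation}\t{full_name}\n")

# Read the file back and split on the delimiter, roughly as
# Hive's delimited-text deserializer would.
with open("states.txt") as f:
    parsed = [line.rstrip("\n").split("\t") for line in f]

print(parsed)  # [['AL', 'Alabama'], ['AK', 'Alaska'], ['AZ', 'Arizona']]
```

Any file placed under /tmp/states in this format becomes queryable through the table, since an external table simply reads whatever data is present at its LOCATION.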
Copy the data file onto HDFS and confirm its presence afterwards:

$ hadoop fs -put states.txt /tmp/states/states.txt
$ hadoop fs -ls /tmp/states
You will receive the following response:
Found 1 items
-rw-r--r--   3 hadoop supergroup   654 2012-03-03 16...