Time for action – exporting query output
We have previously either loaded large quantities of data into Hive or extracted very small quantities as query results. We can also export large result sets; let us look at an example.
Recreate the previously used view:
$ hive -f view.hql
Create the following file as export.hql:
INSERT OVERWRITE DIRECTORY '/tmp/out'
SELECT reported, shape, state
FROM usa_sightings
WHERE state = 'California';
Execute the script:
$ hive -f export.hql
You will receive the following response:
2012-03-04 06:20:44,571 Stage-1 map = 100%, reduce = 100%
Ended Job = job_201203040432_0029
Moving data to: /tmp/out
7599 Rows loaded to /tmp/out
MapReduce Jobs Launched:
Job 0: Map: 2 Reduce: 1 HDFS Read: 75416863 HDFS Write: 210901 SUCESS
Total MapReduce CPU Time Spent: 0 msec
OK
Time taken: 46.669 seconds
Look in the specified output directory:
$ hadoop fs -ls /tmp/out
You will receive the following response:
Found 1 items
-rw-r--r-- 3 hadoop supergroup 210901 … /tmp...
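The files that INSERT OVERWRITE DIRECTORY writes are plain text, and by default Hive separates the fields with the Ctrl-A character (\001) rather than a tab or comma, so the raw output can look as if the columns run together. The following is a minimal sketch of making such a row readable. The sample row is hypothetical and stands in for a real line from the files under /tmp/out; the variable names are illustrative only.

```shell
# Hypothetical sample row; real rows would come from the files under /tmp/out.
# Fields are joined with Hive's default Ctrl-A (\001) delimiter.
sample=$(printf '19950915\001light\001California')

# Translate each Ctrl-A into a tab so the three columns are visible.
readable=$(printf '%s\n' "$sample" | tr '\001' '\t')
printf '%s\n' "$readable"
```

On a cluster, the same tr filter could be placed after a command such as hadoop fs -cat /tmp/out/* | head to inspect the first few exported rows, assuming the export used the default delimiter.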