Design your first Hadoop dashboard
Download a sample big data file, or extract logs from your own systems using a tool such as Flume. For the purposes of this book, we will download the dataset from the following URL:
http://www.seanlahman.com/?s=lahman591-csv.zip
Extract the ZIP file.
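The extraction step can also be scripted. The following is a minimal Python sketch, not part of the original walkthrough: the function name extract_csv_files and the local paths are assumptions made for illustration. It unpacks the archive with the standard zipfile module and returns the CSV files it finds, so that the batting file is easy to locate before uploading.

```python
import zipfile
from pathlib import Path


def extract_csv_files(zip_path, dest_dir):
    """Extract a ZIP archive and return the paths of the CSV files inside it.

    Hypothetical helper for illustration only.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
    # Compare the extension case-insensitively, since the archive may
    # name the file Batting.csv rather than batting.csv.
    return sorted(
        p for p in dest.rglob("*")
        if p.is_file() and p.suffix.lower() == ".csv"
    )
```

Assuming the archive was saved locally as lahman591-csv.zip, calling extract_csv_files("lahman591-csv.zip", "lahman591-csv") would return the extracted CSV paths, including the batting file used in the next steps.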
Upload the data file to HDFS by following these steps:
- Navigate to the HDFS files directory from the Hortonworks web interface.
- Navigate to /user/maria_dev and click on the Upload button.
- Click on the Browse button, navigate to the location where we extracted the downloaded ZIP file, and select the batting.csv file.
- Now, open a Hive view by clicking on the Hive View button.
- In this view, create a table to hold the data by executing the following command:
create table intermediate_batting (col_value STRING);
- Upon execution of the query, we can view the intermediate_batting table under the default database.
- Execute the following command to load the batting.csv data file into the intermediate_batting table:
...
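The intermediate_batting table created in the steps above has a single STRING column, so each line of the CSV file ends up as one row, with the whole line held in col_value. The following Python sketch imitates that shape outside Hive. The function names and the sample lines are made up for illustration, and the comma split only hints at how a later step might break the string into fields.

```python
def to_staging_rows(lines):
    """Mimic a one-column staging table: one row per input line,
    with the entire line stored in col_value."""
    return [{"col_value": line.rstrip("\r\n")} for line in lines]


def split_row(row):
    """Split a staged row on commas, as a later parsing step might."""
    return row["col_value"].split(",")


# Made-up sample lines, not taken from the real dataset.
sample_lines = ["playerA,2004,10\n", "playerB,2005,7\n"]
staged = to_staging_rows(sample_lines)
fields = [split_row(row) for row in staged]
```

Keeping the raw line intact in a staging table defers all parsing decisions to a later query, which is why the table needs only one column at this point.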