Creating Twitter trending topics using Spark Streaming
Spark provides several modules. In this recipe, we are going to take a look at its SQL module, which allows you to execute SQL queries from within a Spark application. We are going to explore how to access Hive from Spark and perform analytics.
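To make the idea concrete before the step-by-step walkthrough, here is a minimal sketch of running a Hive-backed SQL query from a Spark application. It is not the recipe's own code. It assumes Spark 2.x or later, where SparkSession with enableHiveSupport() is the entry point (on Spark 1.x you would use HiveContext instead), and it assumes the spark-sql and spark-hive libraries are on the classpath. The object name HiveQuerySketch and the table name demo_emp are hypothetical and used only for illustration.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical object name; not part of the recipe's project.
object HiveQuerySketch {
  def main(args: Array[String]): Unit = {
    // Build a session that can talk to the Hive metastore.
    // Assumes Spark 2.x+ with the spark-hive module available.
    val spark = SparkSession.builder()
      .appName("HiveQuerySketch")
      .enableHiveSupport()
      .getOrCreate()

    // demo_emp is a hypothetical table used only for this illustration.
    spark.sql("CREATE TABLE IF NOT EXISTS demo_emp (id INT, name STRING)")

    // Run an ordinary SQL query; the result comes back as a DataFrame.
    val counts = spark.sql("SELECT COUNT(*) AS total FROM demo_emp")
    counts.show()

    spark.stop()
  }
}
```

An application like this is typically packaged as a JAR and launched with spark-submit, so that the cluster and Hive configuration are picked up from the environment rather than hard-coded.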
Getting ready
To perform this recipe, you should have Hadoop, Spark, and Hive installed. You also need Scala; I am using Scala 2.11.0.
How to do it...
To use Hive from Spark, we are going to write a sample Spark application in Scala. You can use any IDE you like. Since the application is written in Scala, you will need both Scala and SBT installed on your machine.
First of all, I am going to create a folder called HiveFromSpark and add the following files to it:
HiveFromSpark\src\main\scala\com\demo\HiveFromSpark.scala
HiveFromSpark\project\assembly.sbt
HiveFromSpark\build.sbt
HiveFromSpark\src\main\resources\emp.txt
Then, we set build.sbt to...