Indexing data via Apache Spark
After installing Apache Spark, we can configure it to work with Elasticsearch and write some data into it.
Getting ready
You need an up-and-running Elasticsearch installation, as described in the Downloading and installing Elasticsearch recipe in Chapter 2, Downloading and Setup.
You also need a working installation of Apache Spark.
How to do it...
To configure Apache Spark to communicate with Elasticsearch, we will perform the following steps:
We need to download the Elasticsearch Spark JAR:
wget http://download.elastic.co/hadoop/elasticsearch-hadoop-5.1.1.zip
unzip elasticsearch-hadoop-5.1.1.zip
A quick way to access Elasticsearch from the Spark shell is to copy the required Elasticsearch Hadoop file into Spark's jars directory. The file that must be copied is elasticsearch-spark-20_2.11-5.1.1.jar.
The version of Scala used by both Apache Spark and Elasticsearch Spark must match!
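The copy step and the Scala version check can be sketched in the shell as follows. This is only an illustration: it assumes that the unzipped archive keeps its JARs under a dist/ folder and that a SPARK_HOME environment variable points at the Spark installation, neither of which is stated in this recipe. The _2.11 part of the JAR name is the Scala version the connector was built for.

```shell
# JAR name taken from the recipe; the _2.11 suffix is the Scala binary version.
jar="elasticsearch-spark-20_2.11-5.1.1.jar"

# Strip the extension, then keep what sits between "_" and the next "-".
base="${jar%.jar}"            # elasticsearch-spark-20_2.11-5.1.1
rest="${base#*_}"             # 2.11-5.1.1
scala_version="${rest%%-*}"   # 2.11
echo "JAR built for Scala ${scala_version}"

# Assumption: the archive unpacks its JARs under dist/ and SPARK_HOME is set.
# The copy only runs when both the variable and the file are present.
src="elasticsearch-hadoop-5.1.1/dist/${jar}"
if [ -n "${SPARK_HOME:-}" ] && [ -f "$src" ]; then
  cp "$src" "${SPARK_HOME}/jars/"
fi
```

The extracted value can then be compared with the Scala version that spark-shell --version reports for the local Spark installation.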
For storing data in Elasticsearch via Apache...