Inserting data from Kafka to SolrCloud
Solr is a highly available, fault-tolerant environment for distributing indexed content and query requests across multiple servers. Data cannot be pushed from Kafka into Solr directly; a tool such as Apache Flume is needed to bridge the two.
Getting ready
For this recipe a Kafka cluster must be up and running.
To install Solr, follow the instructions on this page: https://lucene.apache.org/solr/guide/6_6/installing-solr.html.
Apache Flume is also required; follow the installation instructions on this page: https://flume.apache.org/download.html.
How to do it...
- Create a Flume configuration file called flume.conf with this content (a sketch of the remaining properties follows the listing):
flume1.sources = kafka-source-1
flume1.channels = mem-channel-1
flume1.sinks = solr-sink-1
flume1.sources.kafka-source-1.type = org.apache.flume.source.kafka.KafkaSource
flume1.sources.kafka-source-1.zookeeperConnect = localhost:2181
flume1.sources.kafka-source-1.topic = source-topic
flume1.sources.kafka-source-1.batchSize = 100
flume1.sources...
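The listing above is truncated. A minimal sketch of how the remaining agent properties might look, assuming Flume's built-in memory channel and its MorphlineSolrSink; the channel sizes, the morphline file path, and the morphline ID are placeholder values, not taken from the original recipe:

# bind the Kafka source to the memory channel (assumed wiring)
flume1.sources.kafka-source-1.channels = mem-channel-1

# in-memory channel buffering events between source and sink (placeholder capacities)
flume1.channels.mem-channel-1.type = memory
flume1.channels.mem-channel-1.capacity = 10000
flume1.channels.mem-channel-1.transactionCapacity = 100

# MorphlineSolrSink loads events into Solr according to a morphline definition
flume1.sinks.solr-sink-1.type = org.apache.flume.sink.solr.morphline.MorphlineSolrSink
flume1.sinks.solr-sink-1.channel = mem-channel-1
flume1.sinks.solr-sink-1.batchSize = 100
flume1.sinks.solr-sink-1.batchDurationMillis = 1000
flume1.sinks.solr-sink-1.morphlineFile = /etc/flume/conf/morphline.conf
flume1.sinks.solr-sink-1.morphlineId = morphline1

Once the file is complete, the agent can be started with the standard Flume launcher, for example: flume-ng agent --conf conf --conf-file flume.conf --name flume1. The morphline file referenced by the sink defines how each Kafka message is parsed and indexed, typically ending in a loadSolr command that points at the SolrCloud collection through its ZooKeeper ensemble.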