Several tools and frameworks are available for data ingestion. This book covers the following:
- Apache Kafka
- Apache NiFi
- Logstash
- Fluentd
- Apache Flume
Kafka is a message broker that can be connected to most real-time processing frameworks. We use it throughout this book, typically as a data source that holds data read from files in queues for further processing. Download Kafka from https://www.apache.org/dyn/closer.cgi?path=/kafka/0.10.1.1/kafka_2.11-0.10.1.1.tgz to your local machine. Once the kafka_2.11-0.10.1.1.tgz file is downloaded, extract the files using the following...