You're reading from Mastering Apache Storm: Real-time big data streaming using Kafka, Hbase and Redis

Product type Paperback

Published in Aug 2017

Publisher

ISBN-13 9781787125636

Length 284 pages

Edition 1st Edition

Languages

Java

Tools

Apache Storm

Concepts

Big Data

Author (1):

Ankit Jain

Chapter 1, Real-Time Processing and Storm Introduction, gives an introduction to Storm and its components.

Chapter 2, Storm Deployment, Topology Development, and Topology Options, covers deploying Storm on a cluster, deploying the sample topology on a Storm cluster, monitoring the Storm pipeline using the Storm UI, and dynamically changing the log level settings.

Chapter 3, Storm Parallelism and Data Partitioning, covers the parallelism of a topology, how to configure parallelism at the code level, guaranteed message processing, and the tuples that Storm generates internally.

Chapter 4, Trident Introduction, introduces Trident, explains the Trident data model, and shows how to write Trident filters and functions. This chapter also covers repartitioning and aggregation operations on Trident tuples.

Chapter 5, Trident Topology and Uses, introduces Trident tuple grouping, non-transactional topology, and a sample Trident topology. The chapter also introduces Trident state and distributed RPC.

Chapter 6, Storm Scheduler, covers the different types of schedulers available in Storm: the default scheduler, isolation scheduler, resource-aware scheduler, and custom scheduler.

Chapter 7, Monitoring of the Storm Cluster, covers monitoring Storm by writing custom monitoring UIs using the stats published by Nimbus. It explains the integration of Ganglia with Storm using JMXTrans and how to configure Storm to publish JMX metrics.

Chapter 8, Integration of Storm and Kafka, shows the integration of Storm with Kafka. This chapter starts with an introduction to Kafka, covers the installation of Kafka, and ends with the integration of Storm with Kafka to solve real-world problems.

Chapter 9, Storm and Hadoop Integration, covers an overview of Hadoop, writing a Storm topology to publish data into HDFS, an overview of Storm-YARN, and deploying a Storm topology on YARN.

Chapter 10, Storm Integration with Redis, Elasticsearch, and HBase, teaches you how to integrate Storm with various other big data technologies.

Chapter 11, Apache Log Processing with Storm, covers a sample log processing application that parses Apache web server logs and generates business information from the log files.

Chapter 12, Twitter Tweets Collection and Machine Learning, walks you through a case study implementing a machine learning topology in Storm.