Mastering Apache Storm: Real-time big data streaming using Kafka, Hbase and Redis

Author: Jain
Rating: 1 out of 5 (1 rating)
eBook, Aug 2017, 284 pages, 1st Edition
eBook: zł39.99 (list price zł177.99)
Paperback: zł221.99

What do you get with eBook?

  • Instant access to your Digital eBook purchase
  • Download this book in EPUB and PDF formats
  • Access this title in our online reader with advanced features
  • DRM FREE - Read whenever, wherever and however you want

Storm Deployment, Topology Development, and Topology Options

In this chapter, we start with the deployment of Storm on a multi-node cluster (three Storm nodes and three ZooKeeper nodes). This chapter is important because it focuses on how to set up a production Storm cluster and why we need high availability for the Storm Supervisor, Nimbus, and ZooKeeper (Storm uses ZooKeeper to store the metadata of the cluster, the topologies, and so on).

The following are the key points that we are going to cover in this chapter:

  • Deployment of the Storm cluster
  • Program and deploy the word count example
  • Different options of the Storm UI: kill, activate, deactivate, and rebalance
  • Walkthrough of the Storm UI
  • Dynamic log level settings
  • Validating the Nimbus high availability

Storm prerequisites

You should have the Java JDK and ZooKeeper ensemble installed before starting the deployment of the Storm cluster.

Installing Java SDK 7

Perform the following steps to install the Java SDK 7 on your machine. You can also go with JDK 1.8:

  1. Download the Java SDK 7 RPM from Oracle's site (http://www.oracle.com/technetwork/java/javase/downloads/index.html).
  2. Install the Java jdk-7u<version>-linux-x64.rpm file on your CentOS machine using the following command:
sudo rpm -ivh jdk-7u<version>-linux-x64.rpm 
  3. Add the following environment variable in the ~/.bashrc file:
export JAVA_HOME=/usr/java/jdk<version>
  4. Add the path of the bin directory of the JDK to the PATH system environment variable...
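
Once the steps above are done, a quick sanity check can confirm that JAVA_HOME points at a usable JDK. The following is a minimal sketch and not part of the book's procedure: check_java_home is a hypothetical helper name, and the only assumption is that a JDK install contains an executable bin/java.

```shell
# Hypothetical helper (not from the book): checks that a directory looks
# like a JDK install by testing for an executable bin/java inside it.
# With no argument it falls back to the JAVA_HOME environment variable.
check_java_home() {
  jdk_dir="${1:-${JAVA_HOME:-}}"
  if [ -z "$jdk_dir" ]; then
    echo "JAVA_HOME is not set"
    return 1
  fi
  if [ ! -x "$jdk_dir/bin/java" ]; then
    echo "no executable java found under $jdk_dir/bin"
    return 1
  fi
  echo "JDK found at $jdk_dir"
}

# Usage, after reloading ~/.bashrc:
#   check_java_home && "$JAVA_HOME/bin/java" -version
```

Running the check on every machine before installing Storm can catch a missing or mistyped JAVA_HOME early.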

Setting up the Storm cluster

In this chapter, we will learn how to set up a three-node Storm cluster, of which one node will be the active master node (Nimbus) and the other two will be worker nodes (supervisors).

The following is the deployment diagram of our three-node Storm cluster:

The following are the steps that need to be performed to set up a three-node Storm cluster:

  1. Install and run the ZooKeeper cluster. The steps for installing ZooKeeper are mentioned in the previous section.
  2. Download the latest stable Storm release from https://storm.apache.org/downloads.html; at the time of writing, the latest version is Storm 1.0.2.
  3. Once you have downloaded the latest version, copy and unzip it on all three machines. Now, we will set the $STORM_HOME environment variable on each machine to make the setup easier. The $STORM_HOME environment contains the path of the Storm...

Developing the hello world example

Before starting the development, you should have Eclipse and Maven installed on your machine. The sample topology explained here covers how to create a basic Storm project, including a spout and a bolt, and how to build and execute them.

Create a Maven project by using com.stormadvance as groupId and storm-example as artifactId.

Add the following Maven dependencies to the pom.xml file:

<dependency> 
  <groupId>org.apache.storm</groupId> 
  <artifactId>storm-core</artifactId> 
  <version>1.0.2</version> 
  <scope>provided</scope> 
</dependency> 

Make sure the scope of the Storm dependency is provided; otherwise, you will not be able to deploy the topology on the Storm cluster.

Add the following Maven build plugins in the pom.xml file:

<build> 
  <plugins> 
    <plugin...

The different options of the Storm topology

This section covers the following operations that a user can perform on the Storm cluster:

  • Deactivate
  • Activate
  • Rebalance
  • Kill
  • Dynamic log level settings

Deactivate

Storm supports deactivating a topology. In the deactivated state, spouts will not emit any new tuples into the pipeline, but the processing of the already emitted tuples will continue. The following is the command to deactivate a running topology:

$> bin/storm deactivate topologyName 

Deactivate SampleStormClusterTopology using the following command:

$> bin/storm deactivate SampleStormClusterTopology 

The following information is displayed:

0 [main] INFO backtype.storm.thrift - Connecting to Nimbus at...
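
The four lifecycle operations listed at the start of this section share the same command shape (bin/storm, the operation, then the topology name), so they can be wrapped in one small function. This is only a sketch under stated assumptions: topology_op and the STORM_CMD variable are hypothetical names, not part of Storm, and by default the function only prints the command it would run (a dry run) instead of contacting Nimbus.

```shell
# Hypothetical wrapper (not part of Storm) around the CLI lifecycle commands.
# STORM_CMD defaults to "echo bin/storm", so the command is only printed.
# Set STORM_CMD="bin/storm" inside $STORM_HOME to really execute it.
topology_op() {
  op="${1:-}"
  topology="${2:-}"
  case "$op" in
    activate|deactivate|rebalance|kill) ;;
    *)
      echo "unsupported operation: $op" >&2
      return 2
      ;;
  esac
  if [ -z "$topology" ]; then
    echo "usage: topology_op <operation> <topologyName>" >&2
    return 2
  fi
  # Left unquoted on purpose so "echo bin/storm" splits into two words.
  ${STORM_CMD:-echo bin/storm} "$op" "$topology"
}

# Usage (dry run):
#   topology_op deactivate SampleStormClusterTopology
#   topology_op activate SampleStormClusterTopology
```

Rejecting unknown operations before calling the CLI keeps a typo from reaching the cluster; options such as wait times are left out of the sketch on purpose.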

Walkthrough of the Storm UI

This section will show you how we can start the Storm UI daemon. However, before starting the Storm UI daemon, we assume that you have a running Storm cluster. The Storm cluster deployment steps are mentioned in the previous sections of this chapter. Now, go to the Storm home directory (cd $STORM_HOME) at the leader Nimbus machine and run the following command to start the Storm UI daemon:

$> cd $STORM_HOME
$> bin/storm ui &  

By default, the Storm UI starts on port 8080 of the machine where it is started. Now, browse to http://nimbus-node:8080 to view the Storm UI, where nimbus-node is the IP address or hostname of the Nimbus machine.
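
Besides opening the page in a browser, the UI daemon can be checked from the command line. The sketch below assumes the Storm UI REST API is available at /api/v1/cluster/summary; that path comes from the Storm UI REST API and is not mentioned in this chapter. storm_ui_summary_url is a hypothetical helper name, and nimbus-node is the same placeholder hostname used above.

```shell
# Hypothetical helper: builds the cluster summary URL of the Storm UI
# REST API. Host and port default to the placeholders used in this chapter.
storm_ui_summary_url() {
  host="${1:-nimbus-node}"
  port="${2:-8080}"
  echo "http://${host}:${port}/api/v1/cluster/summary"
}

# Usage (requires curl and a running Storm UI daemon):
#   curl -s "$(storm_ui_summary_url nimbus-node 8080)"
```

If the daemon is up, the request should return a JSON document describing the cluster; a connection error usually means the UI daemon is not running on that host or port.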

The following is a screenshot of the Storm home page:

Cluster Summary section

...

Dynamic log level settings

The dynamic log level feature allows us to change the log level of a topology at runtime from the Storm CLI and the Storm UI.
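
For the CLI route, the following sketch builds a set_log_level command of the form storm set_log_level -l <logger>=<level>:<timeout> <topology>. set_topology_log_level and STORM_CMD are hypothetical names introduced here for illustration, the 30-second timeout is an arbitrary example value, and by default the function only prints the command rather than running it.

```shell
# Hypothetical helper (not part of Storm): prints or runs a set_log_level
# command. STORM_CMD defaults to "echo bin/storm" (dry run); set
# STORM_CMD="bin/storm" inside $STORM_HOME to really execute it.
set_topology_log_level() {
  topology="${1:-}"
  logger="${2:-ROOT}"
  level="${3:-ERROR}"
  timeout_secs="${4:-30}"   # example value: seconds before the level reverts
  if [ -z "$topology" ]; then
    echo "usage: set_topology_log_level <topology> [logger] [level] [timeout]" >&2
    return 2
  fi
  # Left unquoted on purpose so "echo bin/storm" splits into two words.
  ${STORM_CMD:-echo bin/storm} set_log_level -l "${logger}=${level}:${timeout_secs}" "$topology"
}

# Usage (dry run):
#   set_topology_log_level SampleStormClusterTopology ROOT ERROR 30
```

The timeout gives a temporary override, which is handy when raising verbosity for a short debugging session.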

Updating the log level from the Storm UI

Go through the following steps to update the log level from the Storm UI:

  1. Deploy SampleStormClusterTopology again on the Storm cluster if it is not running.
  2. Browse the Storm UI at http://nimbus-node:8080/.
  3. Click on the storm_example topology.
  4. Now click on the Change Log Level button to change the ROOT logger of the topology, as shown in the following screenshots:
  5. Configure the entries shown in the following screenshots to change the ROOT logger to ERROR:
  6. If you are planning to...

Summary

In this chapter, we have covered the installation of Storm and ZooKeeper clusters, the deployment of topologies on Storm clusters, the high availability of Nimbus nodes, and topology monitoring through the Storm UI. We have also covered the different operations a user can perform on a running topology. Finally, we focused on how to change the log level of a running topology.

In the next chapter, we will focus on the distribution of topologies on multiple Storm machines/nodes.


Key benefits

  • Exploit the various real-time processing functionalities offered by Apache Storm such as parallelism, data partitioning, and more
  • Integrate Storm with other Big Data technologies like Hadoop, HBase, and Apache Kafka
  • An easy-to-understand guide to effortlessly create distributed applications with Storm

Description

Apache Storm is a real-time Big Data processing framework that processes large amounts of data reliably, guaranteeing that every message will be processed. Storm allows you to scale your data as it grows, making it an excellent platform to solve your big data problems. This extensive guide will help you understand Storm, from the basics to the advanced topics. The book begins with a detailed introduction to real-time processing and where Storm fits in to solve these problems. You’ll get an understanding of deploying Storm on clusters by writing a basic Storm Hello World example. Next, we’ll introduce you to Trident, and you’ll get a clear understanding of how you can develop and deploy a Trident topology. We cover topics such as monitoring, Storm parallelism, the scheduler, and log processing in an easy-to-understand manner. You will also learn how to integrate Storm with other well-known Big Data technologies such as HBase, Redis, Kafka, and Hadoop to realize the full potential of Storm. With real-world examples and clear explanations, this book will give you a thorough mastery of Apache Storm. You will be able to use this knowledge to develop efficient, distributed real-time applications to cater to your business needs.

Who is this book for?

If you are a Java developer who wants to enter into the world of real-time stream processing applications using Apache Storm, then this book is for you. No previous experience in Storm is required as this book starts from the basics. After finishing this book, you will be able to develop not-so-complex Storm applications.

What you will learn

  • Understand the core concepts of Apache Storm and real-time processing
  • Follow the steps to deploy multiple nodes of Storm Cluster
  • Create Trident topologies to support various message-processing semantics
  • Make your cluster sharing effective using Storm scheduling
  • Integrate Apache Storm with other Big Data technologies such as Hadoop, HBase, Kafka, and more
  • Monitor the health of your Storm cluster

Product Details

Publication date: Aug 16, 2017
Length: 284 pages
Edition: 1st
Language: English
ISBN-13: 9781787120402
Vendor: Apache

Table of Contents

12 Chapters
  1. Real-Time Processing and Storm Introduction
  2. Storm Deployment, Topology Development, and Topology Options
  3. Storm Parallelism and Data Partitioning
  4. Trident Introduction
  5. Trident Topology and Uses
  6. Storm Scheduler
  7. Monitoring of Storm Cluster
  8. Integration of Storm and Kafka
  9. Storm and Hadoop Integration
  10. Storm Integration with Redis, Elasticsearch, and HBase
  11. Apache Log Processing with Storm
  12. Twitter Tweet Collection and Machine Learning

Customer reviews

Rating distribution: 1 out of 5 (1 rating)
5 star 0%
4 star 0%
3 star 0%
2 star 0%
1 star 100%
Raghav Alagh, Feb 16, 2024, rated 1 out of 5 (Amazon Verified review):
Worst Book. Nothing explained clearly, just waste of money
