Learning ELK Stack

Learning ELK Stack: Build mesmerizing visualizations, analytics, and logs from your data using Elasticsearch, Logstash, and Kibana

Author: Chhajed
S$12.99 S$52.99
Rating: 3.2 out of 5 (5 Ratings)
eBook Nov 2015 206 pages 1st Edition
eBook
S$12.99 S$52.99
Paperback
S$66.99
Subscription
Free Trial

What do you get with eBook?

  • Instant access to your Digital eBook purchase
  • Download this book in EPUB and PDF formats
  • Access this title in our online reader with advanced features
  • DRM FREE - Read whenever, wherever and however you want

Chapter 1. Introduction to ELK Stack

This chapter explains the importance of log analysis in today's data-driven world and the challenges associated with it. It introduces the ELK stack as a complete log analysis solution and describes the role of each of its open source components: Elasticsearch, Logstash, and Kibana. It also briefly covers the key features of each component and describes the installation and configuration steps for them.

The need for log analysis

Logs tell us how our system is behaving. However, the content and format of logs vary among different services, and even among different components of the same system. For example, a scanner may log error messages related to communication with other devices, whereas a web server logs information on all incoming requests, outgoing responses, the time taken for each response, and so on. Similarly, the application logs of an e-commerce website record business-specific events.

Because logs vary in content, their uses vary too. For example, the logs from a scanner may be used for troubleshooting, a simple status check, or reporting, while the web server log is used to analyze traffic patterns across multiple products. Analyzing the logs of an e-commerce site can help figure out whether packages from a specific location are returned repeatedly, and the probable reasons why.

The following are some common use cases where log analysis is helpful:

  • Issue debugging
  • Performance analysis
  • Security analysis
  • Predictive analysis
  • Internet of things (IoT) and logging

Issue debugging

Debugging is one of the most common reasons to enable logging within your application. The simplest and most frequent use of a debug log is to grep for a specific error message or event occurrence. If a system administrator believes that a program crashed because of a network failure, they will look for a connection dropped message, or something similar, in the server logs to analyze what caused the issue. Once the bug or issue is identified, log analysis solutions help capture application information and snapshots from that particular time, which can easily be passed across development teams for further analysis.
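The grep-style search described above can be sketched in a few lines of Python. This is only an illustration: the function name `find_matching_lines` and the sample log lines are hypothetical and do not come from any real server.

```python
def find_matching_lines(lines, needle):
    """Return (line_number, line) pairs whose text contains needle, ignoring case."""
    needle = needle.lower()
    return [
        (number, line)
        for number, line in enumerate(lines, start=1)
        if needle in line.lower()
    ]


# Hypothetical sample lines, invented for illustration only.
sample_log = [
    "INFO server started",
    "WARN retrying request",
    "ERROR connection dropped by peer",
    "INFO request served",
]

matches = find_matching_lines(sample_log, "connection dropped")
for number, line in matches:
    print(f"{number}: {line}")
```

A search like this works on one file on one machine, which is the limitation the rest of the chapter goes on to discuss.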

Performance analysis

Log analysis helps optimize or debug system performance and gives essential input on bottlenecks in the system. Understanding a system's performance is often about understanding resource usage in the system. Logs can help analyze individual resource usage in the system, the behavior of multiple threads in the application, potential deadlock conditions, and so on. Logs also carry timestamp information, which is essential for analyzing how the system behaves over time. For instance, a web server log can show how individual services are performing based on response times, HTTP response codes, and so on.

Security analysis

Logs play a vital role in managing the application security for any organization. They are particularly helpful to detect security breaches, application misuse, malicious attacks, and so on. When users interact with the system, it generates log events, which can help track user behavior, identify suspicious activities, and raise alarms or security incidents for breaches.

The intrusion detection process involves reconstructing sessions from the logs themselves. For example, ssh login events in the system can be used to identify any breaches on the machines.
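As a rough sketch of how ssh login events might be used this way, the following Python counts failed password attempts per source address. The sample lines are hypothetical (they imitate the style of OpenSSH authentication messages and use documentation-only IP addresses), and the threshold of three attempts is an arbitrary assumption chosen for the example.

```python
import re
from collections import Counter

# Matches messages such as "Failed password for root from 203.0.113.5 port 40022 ssh2".
FAILED_LOGIN = re.compile(
    r"Failed password for (?:invalid user )?(\S+) from (\S+) port \d+"
)


def failed_logins_by_source(lines):
    """Count failed password attempts per source address."""
    counts = Counter()
    for line in lines:
        match = FAILED_LOGIN.search(line)
        if match:
            counts[match.group(2)] += 1
    return counts


# Hypothetical sample lines, invented for illustration only.
sample = [
    "May 24 15:54:01 host1 sshd[2001]: Failed password for invalid user admin from 203.0.113.5 port 40022 ssh2",
    "May 24 15:54:03 host1 sshd[2001]: Failed password for root from 203.0.113.5 port 40022 ssh2",
    "May 24 15:54:09 host1 sshd[2007]: Accepted password for alice from 198.51.100.7 port 50514 ssh2",
    "May 24 15:54:12 host1 sshd[2011]: Failed password for root from 203.0.113.5 port 40031 ssh2",
]

THRESHOLD = 3  # assumed cut-off for flagging a source address
counts = failed_logins_by_source(sample)
suspicious = sorted(ip for ip, n in counts.items() if n >= THRESHOLD)
print(suspicious)
```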

Predictive analysis

Predictive analysis is one of the hot trends of recent times, and log and event data can feed it. Predictive analysis models help in identifying potential customers, resource planning, inventory management and optimization, workload efficiency, and efficient resource scheduling. They also help guide the marketing strategy, user-segment targeting, ad-placement strategy, and so on.

Internet of things and logging

When it comes to IoT devices (devices or machines that interact with each other without any human intervention), it is vital that the system is monitored and managed to keep downtime to a minimum and resolve any important bugs or issues swiftly. Since these devices should be able to work with little human intervention and may exist on a large geographical scale, log data is expected to play a crucial role in understanding system behavior and reducing downtime.

Challenges in log analysis

The current log analysis process mostly involves checking logs on multiple servers, written by different components and systems across your application. This raises various problems that make it a time-consuming and tedious job. Let's look at some of the common problem scenarios:

  • Non-consistent log format
  • Decentralized logs
  • Expert knowledge requirement

Non-consistent log format

Every application and device logs in its own special way, so each format needs its own expert. The differing formats also make it difficult to search across logs.

Let's take a look at some of the common log formats. Notice how they differ in their timestamp formats, in how they represent levels such as INFO and ERROR, and in the order of these components within a log entry. It's difficult to tell just by looking at a log what is present at which position. This is where tools such as Logstash help.

Tomcat logs

A typical Tomcat server startup log entry will look like this:

May 24, 2015 3:56:26 PM org.apache.catalina.startup.HostConfig deployWAR
INFO: Deployment of web application archive \soft\apache-tomcat-7.0.62\webapps\sample.war has finished in 253 ms

Apache access logs – combined log format

A typical Apache access log entry will look like this:

127.0.0.1 - - [24/May/2015:15:54:59 +0530] "GET /favicon.ico HTTP/1.1" 200 21630
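To make the field positions in this entry explicit, here is a minimal Python sketch that splits the sample line into named fields. The regular expression and the field names are assumptions made for this example; it covers only the fields present in the sample (the ones shared with the common log format) and ignores the referrer and user-agent fields that a full combined-format line would add.

```python
import re
from datetime import datetime

# Field names are illustrative choices, not taken from the Apache documentation.
ACCESS_LINE = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>\S+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_access_line(line):
    """Return a dict of typed fields, or None if the line does not match."""
    match = ACCESS_LINE.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["time"] = datetime.strptime(fields["time"], "%d/%b/%Y:%H:%M:%S %z")
    fields["status"] = int(fields["status"])
    fields["size"] = 0 if fields["size"] == "-" else int(fields["size"])
    return fields


entry = parse_access_line(
    '127.0.0.1 - - [24/May/2015:15:54:59 +0530] "GET /favicon.ico HTTP/1.1" 200 21630'
)
print(entry["method"], entry["path"], entry["status"], entry["size"])
print(entry["time"].isoformat())
```

Every log format needs its own pattern of this kind, which is the maintenance burden that a parsing tool is meant to reduce.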

IIS logs

A typical IIS log entry will look like this:

2012-05-02 17:42:15 172.24.255.255 - 172.20.255.255 80 GET /images/favicon.ico - 200 Mozilla/4.0+(compatible;MSIE+5.5;+Windows+2000+Server)

Variety of time formats

Timestamp formats differ as well, among different types of applications, different types of events generated across multiple devices, and so on. The variety of time formats across the components of your system also makes it difficult to correlate events that occur on multiple systems at the same time:

  • 142920788
  • Oct 12 23:21:45
  • [5/May/2015:08:09:10 +0000]
  • Tue 01-01-2009 6:00
  • 2015-05-30 T 05:45 UTC
  • Sat Jul 23 02:16:57 2014
  • 07:38, 11 December 2012 (UTC)
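Correlating events first requires bringing such timestamps into one representation. The sketch below, using only Python's standard library, tries a short list of known formats and converts the result to ISO 8601 in UTC. It handles just three of the formats listed above; the others (for example, the one without a year) would need extra assumptions, so they are deliberately left out. The function name `to_utc_iso` is hypothetical.

```python
from datetime import datetime, timezone

# Formats for three of the sample timestamps above (after light clean-up).
KNOWN_FORMATS = [
    "%d/%b/%Y:%H:%M:%S %z",   # 5/May/2015:08:09:10 +0000
    "%Y-%m-%d T %H:%M UTC",   # 2015-05-30 T 05:45 UTC
    "%H:%M, %d %B %Y UTC",    # 07:38, 11 December 2012 UTC
]


def to_utc_iso(text):
    """Parse a timestamp in one of KNOWN_FORMATS and return ISO 8601 in UTC."""
    # Strip surrounding brackets and unwrap "(UTC)" so the formats stay simple.
    cleaned = text.strip().strip("[]").replace("(UTC)", "UTC")
    for fmt in KNOWN_FORMATS:
        try:
            parsed = datetime.strptime(cleaned, fmt)
        except ValueError:
            continue
        if parsed.tzinfo is None:
            # These formats spell out UTC literally, so attach it explicitly.
            parsed = parsed.replace(tzinfo=timezone.utc)
        return parsed.astimezone(timezone.utc).isoformat()
    raise ValueError(f"unrecognized timestamp format: {text!r}")


for raw in [
    "[5/May/2015:08:09:10 +0000]",
    "2015-05-30 T 05:45 UTC",
    "07:38, 11 December 2012 (UTC)",
]:
    print(raw, "->", to_utc_iso(raw))
```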

Decentralized logs

Logs are mostly spread across all the applications, which may run on different servers and in different components. The complexity of log analysis increases with multiple components logging at multiple locations. On a setup of one or two servers, finding information in the logs involves running cat or tail commands, or piping their output to the grep command. But what if you have 10, 20, or say, 100 servers? Searches of this kind do not scale to a large cluster of machines; they call for a centralized log management and analysis solution.
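To illustrate what centralizing buys you, the following sketch merges per-server logs into a single time-ordered stream that can be searched in one place. It is a toy model, not how any particular tool works: the server names, the log lines, and the "ISO timestamp, space, message" line layout are all assumptions made for the example, and each server's log is assumed to be already sorted by time.

```python
import heapq
from datetime import datetime


def tagged_entries(server, lines):
    """Yield (timestamp, server, message) tuples for one server's log."""
    for line in lines:
        stamp, message = line.split(" ", 1)
        yield (datetime.fromisoformat(stamp), server, message)


def merge_logs(logs_by_server):
    """Merge already-sorted per-server logs into one time-ordered list."""
    streams = [tagged_entries(server, lines) for server, lines in logs_by_server.items()]
    return list(heapq.merge(*streams))


# Hypothetical logs, invented for illustration only.
logs_by_server = {
    "web1": [
        "2015-05-24T15:54:59 GET /cart 200",
        "2015-05-24T15:55:03 GET /checkout 500",
    ],
    "db1": [
        "2015-05-24T15:55:01 slow query 2300 ms",
        "2015-05-24T15:55:02 connection dropped",
    ],
}

merged = merge_logs(logs_by_server)
for stamp, server, message in merged:
    print(stamp.isoformat(), server, message)
```

With the entries interleaved by time, the database error just before the failed checkout request becomes visible without logging in to either machine.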

Expert knowledge requirement

People who want business-centric information out of logs generally don't have access to the logs, or may not have the technical expertise to find the right information quickly, which can make analysis slower and sometimes impossible.

The ELK Stack

The ELK platform is a complete log analytics solution, built on a combination of three open source tools—Elasticsearch, Logstash, and Kibana. It tries to address all the problems and challenges that we saw in the previous section. ELK utilizes the open source stack of Elasticsearch for deep search and data analytics; Logstash for centralized logging management, which includes shipping and forwarding the logs from multiple servers, log enrichment, and parsing; and finally, Kibana for powerful and beautiful data visualizations. ELK stack is currently maintained and actively supported by the company called Elastic (formerly, Elasticsearch).

Let's look at a brief overview of each of these systems:

  • Elasticsearch
  • Logstash
  • Kibana

Elasticsearch

Elasticsearch is a distributed open source search engine based on Apache Lucene, and released under an Apache 2.0 license (which means that it can be downloaded, used, and modified free of charge). It provides horizontal scalability, reliability, and multitenant capability for real-time search. Elasticsearch features are available through JSON over a RESTful API. The searching capabilities are backed by a schema-less Apache Lucene Engine, which allows it to dynamically index data without knowing the structure beforehand. Elasticsearch is able to achieve fast search responses because it searches an index instead of scanning the text itself.
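To give a feel for what "JSON over a RESTful API" means in practice, the sketch below builds (but does not send) a search request using only Python's standard library. The index name `logs`, the field name `message`, and the address `http://localhost:9200` are assumptions for illustration; the request shape follows the `_search` endpoint with a `match` query.

```python
import json
from urllib.request import Request


def build_search_request(base_url, index, field, text):
    """Build an HTTP request that asks Elasticsearch for documents matching text."""
    body = {"query": {"match": {field: text}}}
    return Request(
        url=f"{base_url}/{index}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Assumed local node address, index name, and field name.
request = build_search_request(
    "http://localhost:9200", "logs", "message", "connection dropped"
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))

# Sending it requires a running cluster, for example:
#   from urllib.request import urlopen
#   with urlopen(request) as response:
#       print(json.load(response))
```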

Elasticsearch is used by many big companies, such as GitHub, SoundCloud, FourSquare, Netflix, and many others. Some of the use cases are as follows:

  • Wikipedia: This uses Elasticsearch to provide a full text search, and provide functionalities, such as search-as-you-type, and did-you-mean suggestions.
  • The Guardian: This uses Elasticsearch to process 40 million documents per day, provide real-time analytics of site-traffic across the organization, and help understand audience engagement better.
  • StumbleUpon: This uses Elasticsearch to power intelligent searches across its platform and provide great recommendations to millions of customers.
  • SoundCloud: This uses Elasticsearch to provide real-time search capabilities for millions of users across geographies.
  • GitHub: This uses Elasticsearch to index over 8 million code repositories, and index multiple events across the platform, hence providing real-time search capabilities across it.

Some of the key features of Elasticsearch are:

  • It is an open source distributed, scalable, and highly available real-time document store
  • It provides real-time search and analysis capabilities
  • It provides a sophisticated RESTful API to play around with lookup, and various features, such as multilingual search, geolocation, autocomplete, contextual did-you-mean suggestions, and result snippets
  • It can be scaled horizontally easily and provides easy integrations with cloud-based infrastructures, such as AWS and others

Logstash

Logstash is a data pipeline that helps collect, parse, and analyze a large variety of structured and unstructured data and events generated across various systems. It provides plugins to connect to various types of input sources and platforms, and is designed to efficiently process logs, events, and unstructured data sources for distribution into a variety of outputs through its output plugins, such as file, stdout (output on the console running Logstash), or Elasticsearch.

It has the following key features:

  • Centralized data processing: Logstash helps build a data pipeline that can centralize data processing. With the use of a variety of plugins for input and output, it can convert a lot of different input sources to a single common format.
  • Support for custom log formats: Logs written by different applications often have particular formats specific to the application. Logstash helps parse and process custom formats on a large scale. It provides support to write your own filters for tokenization and also provides ready-to-use filters.
  • Plugin development: Custom plugins can be developed and published, and there is a large variety of custom developed plugins already available.
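The input, filter, and output stages described above can be mimicked with plain Python generators. To be clear, this is not Logstash configuration and does not use any Logstash API; it is only a conceptual sketch, and the stage names, the `level` field, and the sample lines are all hypothetical.

```python
import re

LEVEL_PATTERN = re.compile(r"\b(INFO|WARN|ERROR)\b")


def read_input(lines):
    """Input stage: wrap each raw line in an event dictionary."""
    for line in lines:
        yield {"message": line.rstrip("\n")}


def add_level(events):
    """Filter stage: extract a log level into a common field."""
    for event in events:
        match = LEVEL_PATTERN.search(event["message"])
        event["level"] = match.group(1) if match else "UNKNOWN"
        yield event


def collect(events):
    """Output stage: gather events (a real pipeline would ship them onward)."""
    return list(events)


# Hypothetical lines in two different styles, invented for illustration only.
raw_lines = [
    "INFO: Deployment has finished in 253 ms",
    "2015-05-24 15:56:30 ERROR connection dropped",
    "something without a level",
]

events = collect(add_level(read_input(raw_lines)))
for event in events:
    print(event)
```

The point of the pattern is that differently formatted inputs leave the filter stage with the same fields, so everything downstream can treat them uniformly.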

Kibana

Kibana is an open source Apache 2.0 licensed data visualization platform that helps in visualizing any kind of structured and unstructured data stored in Elasticsearch indexes. Kibana is entirely written in HTML and JavaScript. It uses the powerful search and indexing capabilities of Elasticsearch, exposed through its RESTful API, to display powerful graphics for end users. From basic business intelligence to real-time debugging, Kibana plays its role by exposing data through beautiful histograms, geomaps, pie charts, graphs, tables, and so on.

Kibana makes it easy to understand large volumes of data. Its simple browser-based interface enables you to quickly create and share dynamic dashboards that display changes to Elasticsearch queries in real time.

Some of the key features of Kibana are as follows:

  • It provides flexible analytics and a visualization platform for business intelligence.
  • It provides real-time analysis, summarization, charting, and debugging capabilities.
  • It provides an intuitive and user-friendly interface, which is highly customizable through drag and drop features and alignments as and when needed.
  • It allows saving the dashboard, and managing more than one dashboard. Dashboards can be easily shared and embedded within different systems.
  • It allows sharing snapshots of logs that you have already searched through, and isolating multiple problem transactions.

ELK data pipeline

A typical ELK stack data pipeline looks something like this:

[Figure: ELK data pipeline]

In a typical ELK Stack data pipeline, logs from multiple application servers are shipped through a Logstash shipper to a centralized Logstash indexer. The Logstash indexer outputs data to an Elasticsearch cluster, which Kibana queries to display visualizations and build dashboards over the log data.


Description

The ELK stack (Elasticsearch, Logstash, and Kibana) is a powerful combination of open source tools. Elasticsearch is for deep search and data analytics. Logstash is for centralized logging, log enrichment, and parsing. Kibana is for powerful and beautiful data visualizations. In short, the ELK stack makes searching and analyzing data easier than ever before. This book will introduce you to the ELK stack, starting by showing you how to set it up by installing the tools and applying basic configuration. You'll move on to building a basic data pipeline using the ELK stack. Next, you'll explore the key features of Logstash and its role in the ELK stack, including creating Logstash plugins, which will enable you to use your own customized plugins. The importance of Elasticsearch and Kibana in the ELK stack is also covered, along with various types of advanced data analysis and a variety of charts, tables, and maps. By the end of the book, you will be able to develop a full-fledged data pipeline using the ELK stack and have a solid understanding of the role of each of the components.

Who is this book for?

If you are a developer or DevOps engineer interested in building a system that provides insights and business metrics from data sources of various formats and types, using the open source technology stack that ELK provides, then this book is for you. Basic knowledge of Unix or any programming language will help you make the most of this book.

What you will learn

  • Install, configure, and run Elasticsearch, Logstash, and Kibana
  • Understand the need for log analytics and the current challenges in log analysis
  • Build your own data pipeline using the ELK stack
  • Familiarize yourself with the key features of Logstash and the variety of input, filter, and output plugins it provides
  • Build your own custom Logstash plugin
  • Create actionable insights using charts, histograms, and quick search features in Kibana 4
  • Understand the role of Elasticsearch in the ELK stack

Product Details

Publication date: Nov 26, 2015
Length: 206 pages
Edition: 1st
Language: English
ISBN-13: 9781785886706
Vendor: Elastic

Table of Contents

11 Chapters
1. Introduction to ELK Stack
2. Building Your First Data Pipeline with ELK
3. Collect, Parse and Transform Data with Logstash
4. Creating Custom Logstash Plugins
5. Why Do We Need Elasticsearch in ELK?
6. Finding Insights with Kibana
7. Kibana – Visualization and Dashboard
8. Putting It All Together
9. ELK Stack in Production
10. Expanding Horizons with ELK
Index

Customer reviews

Rating distribution
3.2 out of 5 (5 Ratings)
5 star 20%
4 star 20%
3 star 40%
2 star 0%
1 star 20%

Akshay, Aug 24, 2017
Rating: 5 out of 5
Great to begin ELK
Amazon Verified review

Amazon Customer, Jan 25, 2016
Rating: 4 out of 5
Just over a month ago, I purchased a copy of Learning ELK stack and is a good book for those wanting a quick ramp up of what ELK is, This is a good step-by-step guide for Elastic Search, Logstash and Kibana. this book its well written and gives an honest account, i would highly recommend this as a first book read to anyone wanting to know about ELK.
Amazon Verified review

Kindle Customer, Jun 07, 2017
Rating: 3 out of 5
Great quickstart, got me up and going very quickly. It does however lack depth so if you need a reference this would have to be in addition to the elk documentation.
Amazon Verified review

Mark Grover, Jan 09, 2016
Rating: 3 out of 5
A very good book, let down by inaccurate examples of unix command syntax which quite simply doesnt work. I am new to ELK and followed the example scenario presented in the book but had to google and research some of the command syntax as the printed examples returned errors. Its such a shame as the author presents a very thorough step by step guide to configuring the ELK components. I did manage to get everything working and would still recommend the book to newcomers despite the Unix command issues.
Amazon Verified review

Jascha Casadio, Jul 02, 2016
Rating: 1 out of 5
The Elasticsearch, Logstash and Kibana trinity, usually referred to as the ELK stack, is by far, the de facto standard in log centralization and analysis. Despite being such a popular solution, with some half a million downloads per month, the titles available to the stack or specific to its components are still very limited. This is partially compensated by the official documentation, which is both friendly and easy to follow and allows anyone to quickly get started. Learning ELK Stack is the only title available, until now, that covers the three products at once. It targets beginners who are interested in an overall view of the stack and its components.

Released at the end of 2015, Learning ELK Stack is a short book spanning around 200 pages. As any typical beginner's title, it does start with an introductory chapter that gets the reader through the installation process. What follows is a series of chapters where the author first shows the power of the stack and then dives deeper into its components.

The very first chapter already reveals problems that are present throughout the whole book. It is very superficial and does not really get a beginner started. First of all, the stack does require Oracle's JDK installed and configured. This is completely overlooked. Now while someone might argue that it is pretty straightforward through a package manager, the reader can still be using a distro whose package manager installs a version of the JDK that is not suitable for the ELK stack. Likewise, the whole stack can be installed through simple apt commands. The author does cover installation through .tar.gz archives, completely overlooking installing and configuring Java from the source and the default Java (yea, you can have multiple at the same time). Which is not that straightforward.

Installation apart, nothing is said about the configuration of the three softwares neither as a standalone nor as a stack. Well, this is not properly correct. The astonishing amount of twelve lines is dedicated to this in fact. After an overall overview of the stack, with an example built using data from Yahoo (you ain't limited to use the stack to process logs), the author focuses on the components, each at a time. These chapters feel like a reference. Each option is listed, but the examples do not go beyond the two lines. Interestingly, by googling sentences from the book, we find a 70% match analysis with the official documentation (Harvard refers to this as mosaic plagiarism), suggesting a copy/paste with a couple of words added/removed.

As an example, this is what we find in Learning ELK stack:

The mutate filter is an important filter plugin that helps rename, remove, replace, and modify fields in an incoming event.

And this is the information we freely find in the official documentation provided by Elastic:

The mutate filter allows you to perform general mutations on fields. You can rename, remove, replace, and modify fields in your events.

Tying it all up, I do not really recommend this book, despite being the only title covering the stack as a whole. The documentation the reader will find is first of all inaccurate. Anything else can be found for free in the official documentation.

I have tried to contact Packt Publishing asking what is their position on that matter. I did not get any real answer apart from a semi-automatic one.

As usual, you can find more reviews on my personal blog: books.lostinmalloc.com. Feel free to pass by and share your thoughts!
Amazon Verified review

FAQs

How do I buy and download an eBook?

Where there is an eBook version of a title available, you can buy it from the book details for that title. Add either the standalone eBook or the eBook and print book bundle to your shopping cart. Your eBook will show in your cart as a product on its own. After completing checkout and payment in the normal way, you will receive your receipt on the screen containing a link to a personalised PDF download file. This link will remain active for 30 days. You can download backup copies of the file by logging in to your account at any time.

If you already have Adobe Reader installed, then clicking on the link will download and open the PDF file directly. If you don't, then save the PDF file on your machine and download Adobe Reader to view it.

Please Note: Packt eBooks are non-returnable and non-refundable.

Packt eBook and Licensing

When you buy an eBook from Packt Publishing, completing your purchase means you accept the terms of our licence agreement. Please read the full text of the agreement. In it, we have tried to balance the need for the eBook to be usable for you, the reader, with our need to protect our rights as publishers and those of our authors. In summary, the agreement says:

  • You may make copies of your eBook for your own use onto any machine
  • You may not pass copies of the eBook on to anyone else

How can I make a purchase on your website?

If you want to purchase a video course, eBook, or Bundle (Print+eBook), please follow the steps below:

  1. Register on our website using your email address and a password.
  2. Search for the title by name or ISBN using the search option.
  3. Select the title you want to purchase.
  4. Choose the format you wish to purchase the title in; if you order the Print Book, you get a free eBook copy of the same title.
  5. Proceed with the checkout process (payment to be made using Credit Card, Debit Card, or PayPal)

Where can I access support around an eBook?

  • If you experience a problem with using or installing Adobe Reader, then contact Adobe directly.
  • To view the errata for the book, see www.packtpub.com/support and view the pages for the title you have.
  • To view your account details or to download a new copy of the book go to www.packtpub.com/account
  • To contact us directly if a problem is not resolved, use www.packtpub.com/contact-us

What eBook formats do Packt support?

Our eBooks are currently available in a variety of formats such as PDF and ePub. In the future, this may well change with trends and development in technology, but please note that our PDFs are not Adobe eBook Reader format, which has greater restrictions on security.

You will need to use Adobe Reader v9 or later in order to read Packt's PDF eBooks.

What are the benefits of eBooks?

  • You can get the information you need immediately
  • You can easily take them with you on a laptop
  • You can download them an unlimited number of times
  • You can print them out
  • They are copy-paste enabled
  • They are searchable
  • There is no password protection
  • They are priced lower than print
  • They save resources and space

What is an eBook?

Packt eBooks are a complete electronic version of the print edition, available in PDF and ePub formats. Every piece of content down to the page numbering is the same. Because we save the costs of printing and shipping the book to you, we are able to offer eBooks at a lower cost than print editions.

When you have purchased an eBook, simply log in to your account and click on the link in Your Download Area. We recommend saving the file to your hard drive before opening it.

For optimal viewing of our eBooks, we recommend you download and install the free Adobe Reader version 9.