You're reading from Mastering Kibana 6.x Visualize your Elastic Stack data with histograms, maps, charts, and graphs

Product type Paperback

Published in Jul 2018

Publisher Packt

ISBN-13 9781788831031

Length 376 pages

Edition 1st Edition

Languages

Java

Tools

Kibana

Concepts

Data Visualization

Author (1):

Anurag Srivastava

View More author details

What is ELK Stack?

ELK Stack is a stack with three different open source software—Elasticsearch, Logstash, and Kibana. Elasticsearch is a search engine that is developed on top of Apache Lucene. Logstash is basically used for data pipelining where we can get data from any data source as an input, transform it if required, and send it to any destination as an output. In general, we use Logstash to push the data into Elasticsearch. Kibana is a dashboard or visualization tool, which can be configured with Elasticsearch to generate charts, graphs, and dashboards using our data:

We can use ELK Stack for different use cases, the most common being log analysis. Other than that, we can use it for business intelligence, application security and compliance, web analytics, fraud management, and so on.

In the following subsections, we are going to be looking at ELK Stack's components.

Elasticsearch

Elasticsearch is a full text search engine that can be used as a NoSQL database and as an analytics engine. It is easy to scale, schemaless, and near real time, and provides a restful interface for different operations. It is schemaless, and it uses inverted indexes for data storage. There are different language clients available for Elasticsearch, as follows:

Java
PHP
Perl
Python
.NET
Ruby
JavaScript
Groovy

The basic components of Elasticsearch are as follows:

Cluster
Node
Index
Type
Document
Shard

Logstash

Logstash is basically used for data pipelining, through which we can take input from different sources and output to different data sources. Using Logstash, we can clean the data through filter options and mutate the input data before sending it to the output source. Logstash has different adapters to handle different applications, such as for MySQL or any other relational database connection. We have a JDBC input plugin through which we can connect to MySQL server, run queries, and take the table data as the input in Logstash. For Elasticsearch, there is a connector in Logstash that gives us the option to seamlessly transfer data from Logstash to Elasticsearch.

To run Logstash, we need to install Logstash and edit the configuration file logstash.conf, which consists of an input, output, and filter sections. We need to tell Logstash where it should get the input from through the input block, what it should do with the input through the filter block, and where it should send the output through the output block. In the following example, I am reading an Apache Access Log and sending the output to Elasticsearch:

input {
     file {
         path => "/var/log/apache2/access.log"
         }
 }
 
 filter {
     grok {
         match => { message => "%{COMBINEDAPACHELOG}" }
         }
 }
 
output {
   elasticsearch {
        hosts => "http://127.0.0.1:9200"
        index => "logs_apache"
        document_type => "logs"
        }
}

The input block is showing a file key that is set to /var/log/apache2/access.log. This means that we are getting the file input and path of the file, /var/log/apache2/access.log, which is Apache's log file. The filter block is showing the grok filter, which converts unstructured data into structured data by parsing it.

There are different patterns that we can apply for the Logstash filter. Here, we are parsing the Apache logs, but we can filter different things, such as email, IP addresses, and dates.

Kibana

Kibana is a dashboarding open source software from ELK Stack, and it is a very good tool for creating different visualizations, charts, maps, and histograms, and by integrating different visualizations together, we can create dashboards. It is part of ELK Stack; hence it is quite easy to read the Elasticsearch data. This does not require any programming skills. Kibana has a beautiful UI for creating different types of visualizations, including charts, histograms, and dashboards.

It provides us with different inbuilt dashboards with multiple visualizations when we use Beats, as it automatically creates multiple visualizations that we can customize to create a useful dashboard, such as for CPU usage and memory usage.

Beats

Beats are basically data shippers that are grouped to do single-purpose jobs. For example, Metricbeat is used to collect metrics for memory usage, CPU usage, and disk space, whereas Filebeat is used to send file data such as logs. They can be installed as agents on different servers to send data from different sources to a central Logstash or Elasticsearch cluster. They are written in Go; they work on a cross-platform environment; and they are lightweight in design. Before Beats, it was very difficult to get data from different machines as there was no single-purpose data shipper, and we had to do some tweaking to get the desired data from servers.

For example, if I am running a web application on the Apache web server and want to run it smoothly, then there are two things that need to be monitored—first, all of the errors from the application, and second, the server's performance, such as memory usage, CPU usage, and disk space. So, in order to collect this information, we need to install the following two Beats on our machine:

Filebeat: This is used to collect log data from Apache web server in an incremental way. Filebeat will run on the server and will periodically check for any change in the Apache log. When there is any change in the Apache log file, it will send the log to Logstash. Logstash will receive the data file and execute the filter to find the errors. After filtering the data, it saves the data into Elasticsearch.

Metricbeat: This is used to collect metrics for memory usage, CPU usage, disk space, and so on. Metricbeat collects the server metrics, such as memory usage, CPU usage, and disk space, and saves the data into Elasticsearch. Metrics data sends a predefined set of data, and there is no need to modify anything; that is why it sends data directly to Elasticsearch instead of sending it to Logstash first.

To visualize this data, we can use Kibana to create meaningful dashboards through which we can get complete control of our data.