Setting up Linux systems

If you are using a Linux system (generally in a production environment), you need to perform some extra setup to improve performance or to resolve production problems with many indices.

This recipe covers the following two common errors that happen in production:

  • Too many open files, which can corrupt your indices and your data
  • Slow performance in search and indexing due to the garbage collector

Big problems arise when you run out of disk space: in this scenario, some files can become corrupted. To prevent your indices from corruption and possible data loss, it is best to monitor the available storage space. Default settings protect the cluster by stopping shard allocation and, finally, blocking index writes once disk usage passes the configured watermarks.
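
If you need different thresholds, Elasticsearch exposes them as disk-based shard allocation settings in elasticsearch.yml. The following is a minimal sketch; the values shown are the Elasticsearch 7.x defaults:

# Stop allocating new shards to a node whose disk usage exceeds this value
cluster.routing.allocation.disk.watermark.low: 85%
# Start relocating shards away from a node above this value
cluster.routing.allocation.disk.watermark.high: 90%
# Make indices with a shard on this node read-only (blocks writes)
cluster.routing.allocation.disk.watermark.flood_stage: 95%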

Getting ready

As we described in the Downloading and installing Elasticsearch recipe in this chapter, you need a working Elasticsearch installation and a simple text editor to change configuration files.

How to do it…

To improve the performance on Linux systems, we will perform the following steps:

  1. First, you need to change the current limits for the user that runs the Elasticsearch server. In these examples, we will call this user elasticsearch.
  2. To allow Elasticsearch to manage a large number of files, you need to increase the number of file descriptors (the number of open files) that a user can manage. To do so, edit your /etc/security/limits.conf file and add the following lines at the end (a quick verification sketch follows this list):
elasticsearch - nofile 65536
elasticsearch - memlock unlimited
  3. Then, restart the machine to make sure that the changes have been applied.
  4. Newer versions of Ubuntu (that is, version 16.04 or later) may skip the /etc/security/limits.conf file in their init.d scripts. In these cases, you need to edit the files under /etc/pam.d/ and uncomment the following line (remove the leading #):
# session required pam_limits.so
  5. To prevent memory swapping, you need to set the following parameter in elasticsearch.yml:
bootstrap.memory_lock: true
  6. To fix the memory used by the Elasticsearch server, we need to set the same value for Xms and Xmx in $ES_HOME/config/jvm.options (that is, we set 1 GB of memory in this case), as follows:
-Xms1g
-Xmx1g
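
Once these steps are done, you can check that the new limits are actually visible to the Elasticsearch user. The following is a minimal verification sketch, assuming that the elasticsearch user from step 1 exists on this machine:

# Open a login shell as the elasticsearch user and print its limits
sudo su -s /bin/bash - elasticsearch -c 'ulimit -n; ulimit -l'
# Expected output after a reboot:
# 65536
# unlimited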

How it works…

The standard limit of file descriptors (the maximum number of open files for a user; see https://www.bottomupcs.com/file_descriptors.xhtml) is typically 1,024 or 8,096. When you store a lot of records in several indices, you run out of file descriptors very quickly, so your Elasticsearch server becomes unresponsive and your indices may become corrupted, causing you to lose your data.

Changing the limit to a very high number means that your Elasticsearch server doesn't hit the maximum number of open files.
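
To see how close a running node is to this limit, the node stats API reports both the current and the maximum number of file descriptors. A quick check, assuming the node listens on the default localhost:9200:

# Compare open_file_descriptors against max_file_descriptors per node
curl -s 'localhost:9200/_nodes/stats/process?filter_path=**.open_file_descriptors,**.max_file_descriptors&pretty'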

The other settings for memory prevent Elasticsearch from swapping memory and give a performance boost in a production environment. These settings are required because, during indexing and searching, Elasticsearch creates and destroys a lot of objects in memory. This large number of create/destroy actions fragments the memory and reduces performance: the memory becomes full of holes and, when the system needs to allocate more memory, it suffers the overhead of finding compacted memory. If you don't set bootstrap.memory_lock: true, Elasticsearch dumps the whole process memory on disk and defragments it back in memory, freezing the system. With this setting, the defragmentation step is done entirely in memory, with a huge performance boost.
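
You can confirm that the memory lock actually took effect by querying the nodes info API, which reports it per node. A quick check, again assuming the default localhost:9200:

curl -s 'localhost:9200/_nodes?filter_path=**.mlockall&pretty'
# Every node should report "mlockall" : true; false usually means the
# memlock limit from /etc/security/limits.conf was not applied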
