Elasticsearch 7.0 Cookbook

Over 100 recipes for fast, scalable, and reliable search for your enterprise

Product type: Paperback
Published in: Apr 2019
Publisher: Packt
ISBN-13: 9781789956504
Length: 724 pages
Edition: 4th Edition
Author: Alberto Paro

Setting up Linux systems

If you are using a Linux system (generally in a production environment), you need to perform some extra setup to improve performance and to avoid production problems when working with many indices.

This recipe covers the following two common errors that happen in production:

  • Too many open files, which can corrupt your indices and your data
  • Slow performance in search and indexing due to the garbage collector

Big problems also arise when you run out of disk space, because some files can get corrupted. To protect your indices from corruption and possible data loss, it is best to monitor your storage space. By default, Elasticsearch stops allocating new shards to a node whose disk is more than 85% full and marks the affected indices read-only once disk usage exceeds 95%.
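
As a minimal sketch of the storage monitoring mentioned above, the following Python snippet reports how full a filesystem is. The helper names and the data path are hypothetical, and the 80% default is only an illustrative alert level, not an Elasticsearch setting.

```python
import shutil


def disk_usage_percent(path):
    """Return the percentage of used space on the filesystem that holds `path`."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def is_over_threshold(used_percent, threshold_percent=80.0):
    """Return True when usage is above the (illustrative) alert threshold."""
    return used_percent > threshold_percent


if __name__ == "__main__":
    # Assumption: replace "/" with the directory that holds your Elasticsearch data.
    data_path = "/"
    used = disk_usage_percent(data_path)
    state = "WARNING" if is_over_threshold(used) else "ok"
    print(f"{data_path}: {used:.1f}% used ({state})")
```

A check like this could be run periodically (for example, from cron) so that you are warned before the disk fills up.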

Getting ready

As we described in the Downloading and installing Elasticsearch recipe in this chapter, you need a working Elasticsearch installation and a simple text editor to change configuration files.

How to do it…

To improve the performance on Linux systems, we will perform the following steps:

  1. First, you need to change the current limit for the user that runs the Elasticsearch server. In these examples, we will call this user elasticsearch.
  2. To allow Elasticsearch to manage a large number of files, you need to increase the number of file descriptors (number of files) that a user can manage. To do so, you must edit your /etc/security/limits.conf file and add the following lines at the end:
elasticsearch - nofile 65536
elasticsearch - memlock unlimited
  3. Then, restart the machine to be sure that the changes have been applied.
  4. Recent versions of Ubuntu (that is, version 16.04 or later) can skip the /etc/security/limits.conf file in the init.d scripts. In these cases, you need to edit /etc/pam.d/ and uncomment the following line:
# session required pam_limits.so
  5. To control memory swapping, you need to set the following parameter in elasticsearch.yml:
bootstrap.memory_lock: true
  6. To fix the memory usage size of the Elasticsearch server, we need to set the same values for Xms and Xmx in $ES_HOME/config/jvm.options (in this case, we set 1 GB of memory), as follows:
-Xms1g
-Xmx1g
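
To double-check the heap step, a small script can confirm that Xms and Xmx are set to the same value. This is only a sketch: the function names are hypothetical, and it assumes the plain -Xms/-Xmx form shown above, ignoring comments and any other JVM flags.

```python
import re

# Matches plain heap flags such as -Xms1g or -Xmx512m (unit suffix is optional).
HEAP_OPTION = re.compile(r"^-(Xms|Xmx)(\d+)([kKmMgG]?)$")
UNIT_BYTES = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def heap_settings(jvm_options_text):
    """Return a dict like {'Xms': bytes, 'Xmx': bytes} for the heap flags found."""
    found = {}
    for raw_line in jvm_options_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        match = HEAP_OPTION.match(line)
        if match:
            name, number, unit = match.groups()
            found[name] = int(number) * UNIT_BYTES[unit.lower()]
    return found


def heap_is_fixed(jvm_options_text):
    """Return True when both flags are present and have the same size."""
    settings = heap_settings(jvm_options_text)
    return "Xms" in settings and settings.get("Xms") == settings.get("Xmx")


if __name__ == "__main__":
    sample = "# heap size\n-Xms1g\n-Xmx1g\n"
    print(heap_settings(sample), heap_is_fixed(sample))
```

In practice, you would pass the contents of $ES_HOME/config/jvm.options to heap_is_fixed instead of the sample string.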

How it works…

The standard limit of file descriptors (the maximum number of open files for a user; see https://www.bottomupcs.com/file_descriptors.xhtml) is typically 1,024 or 8,096. When you store a lot of records in several indices, you run out of file descriptors very quickly. Your Elasticsearch server then becomes unresponsive and your indices may become corrupted, causing you to lose your data.

Changing the limit to a very high number means that your Elasticsearch server doesn't hit the maximum number of open files.
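
One way to confirm that the new limits are in effect is to read them from a process started by the user that runs Elasticsearch. The following sketch uses Python's standard resource module; the helper names are hypothetical, and the 65536 default simply mirrors the nofile value used in this recipe.

```python
import resource


def describe_limit(value):
    """Render a limit value, showing 'unlimited' for RLIM_INFINITY."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def current_limits():
    """Return the (soft, hard) limits for open files and locked memory."""
    return {
        "nofile": resource.getrlimit(resource.RLIMIT_NOFILE),
        "memlock": resource.getrlimit(resource.RLIMIT_MEMLOCK),
    }


def nofile_is_sufficient(soft_limit, required=65536):
    """Return True when the soft open-files limit meets the required value."""
    return soft_limit == resource.RLIM_INFINITY or soft_limit >= required


if __name__ == "__main__":
    # Run this as the same user that runs Elasticsearch.
    for name, (soft, hard) in current_limits().items():
        print(f"{name}: soft={describe_limit(soft)} hard={describe_limit(hard)}")
```

Note that the values reported are those of the process running the script, so it only reflects the Elasticsearch limits when run under the same user and login path.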

The other setting, for memory, prevents Elasticsearch from swapping memory and gives a performance boost in a production environment. This setting is required because, during indexing and searching, Elasticsearch creates and destroys a lot of objects in memory. This large number of create/destroy actions fragments the memory and reduces performance. The memory then becomes full of holes and, when the system needs to allocate more memory, it suffers an overhead to find compacted memory. If you don't set bootstrap.memory_lock: true, Elasticsearch dumps the whole process memory on disk and defragments it back in memory, freezing the system. With this setting, the defragmentation step is done entirely in memory, with a huge performance boost.
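
After restarting the node, you can check whether the memory lock was actually applied by looking at the mlockall flag in the nodes information API. The sketch below assumes a node reachable at http://localhost:9200 without authentication; the function names are hypothetical.

```python
import json
from urllib.request import urlopen


def nodes_without_memory_lock(nodes_response):
    """Return the sorted IDs of nodes whose process reports mlockall as false."""
    nodes = nodes_response.get("nodes", {})
    return sorted(
        node_id
        for node_id, info in nodes.items()
        if not info.get("process", {}).get("mlockall", False)
    )


def fetch_nodes_info(base_url="http://localhost:9200"):
    """Fetch only the mlockall flags from the nodes information API."""
    # Assumption: the cluster is reachable at base_url and needs no credentials.
    with urlopen(base_url + "/_nodes?filter_path=**.mlockall") as response:
        return json.load(response)


# Example usage against a running node:
#   unlocked = nodes_without_memory_lock(fetch_nodes_info())
#   print("Nodes without memory lock:", unlocked or "none")
```

If a node is listed, the lock request most likely failed, which is often because the memlock limit for the Elasticsearch user was not raised.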
