Search icon CANCEL
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Conferences
Free Learning
Arrow right icon
Arrow up icon
GO TO TOP
ElasticSearch Cookbook

You're reading from   ElasticSearch Cookbook As a user of ElasticSearch in your web applications you'll already know what a powerful technology it is, and with this book you can take it to new heights with a whole range of enhanced solutions from plugins to scripting.

Arrow left icon
Product type Paperback
Published in Dec 2013
Publisher Packt
ISBN-13 9781782166627
Length 422 pages
Edition 1st Edition
Languages
Arrow right icon
Author (1):
Arrow left icon
Alberto Paro Alberto Paro
Author Profile Icon Alberto Paro
Alberto Paro
Arrow right icon
View More author details
Toc

Table of Contents (14) Chapters Close

Preface 1. Getting Started FREE CHAPTER 2. Downloading and Setting Up ElasticSearch 3. Managing Mapping 4. Standard Operations 5. Search, Queries, and Filters 6. Facets 7. Scripting 8. Rivers 9. Cluster and Nodes Monitoring 10. Java Integration 11. Python Integration 12. Plugin Development Index

Understanding cluster, replication, and sharding

Related to shard management, there is the key concept of replication and cluster status.

Getting ready

You need one or more nodes running to have a cluster. To test an effective cluster you need at least two nodes (they can be on the same machine).

How it works...

An index can have one or more replicas—the shards are called primary if they are part of the master index and secondary if they are part of replicas.

To maintain consistency in write operations the following workflow is executed:

  1. The write is first executed in the primary shard.
  2. If the primary write is successfully done, it is propagated simultaneously in all the secondary shards.
  3. If a primary shard dies, a secondary one is elected as primary (if available) and the flow is re-executed.

During search operations, a valid set of shards is chosen randomly between primary and secondary to improve performances.

The following figure shows an example of possible shards configuration:

How it works...

Best practice

In order to prevent data loss and to have High Availability, it's good to have at least one replica so that your system can survive a node failure without downtime and without loss of data.

There's more…

Related to the concept of replication there is the cluster indicator of the health of your cluster.

It can cover three different states:

  • Green: Everything is ok.
  • Yellow: Something is missing but you can work.
  • Red: "Houston, we have a problem". Some primary shards are missing.

How to solve the yellow status

Mainly yellow status is due to some shards that are not allocated. If your cluster is in recovery status, just wait if there is enough space in nodes for your shards.

If your cluster, even after recovery is still in yellow state, it means you don't have enough nodes to contain your replicas so you can either reduce the number of your replicas or add the required number of nodes.

Best practice

The total number of nodes must not be lower than the maximum number of replicas.

How to solve the red status

When you have lost data (that is, one or more shard is missing), you need to try restoring the node(s) that are missing. If your nodes restart and the system goes back to yellow or green status you are safe. Otherwise, you have lost data and your cluster is not usable. In this case, delete the index/indices and restore them from backup (if you have it) or from other sources.

Best practice

To prevent data loss, I suggest having always at least two nodes and the replica set to 1.

Tip

Having one or more replicas on different nodes on different machines allows you to have a live backup of your data, always updated.

See also

  • Replica and shard management in this chapter.
You have been reading a chapter from
ElasticSearch Cookbook
Published in: Dec 2013
Publisher: Packt
ISBN-13: 9781782166627
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $19.99/month. Cancel anytime