Introduction
In the ElasticSearch ecosystem, it can be immensely useful to monitor the nodes and the cluster in order to control and improve their performance and state. Several kinds of problems can arise at the cluster level, such as the following:
- Node overhead occurs when some nodes have too many shards allocated and become a bottleneck for the whole cluster.
- Node shutdown can happen for many reasons, for example a full disk, a hardware failure, or power problems.
- Shard relocation problems or corruption may occur, leaving some shards unable to come online.
- If a shard is too big, the index performance decreases because Lucene has to merge massive segments.
- Empty indices and shards waste memory and storage. Because every shard has many active threads, a huge number of unused indices and shards also degrades the overall cluster performance.
Malfunctions or poor performance can be detected via the API or via frontend plugins that can be activated in ElasticSearch...
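As a rough illustration of API-based detection, the following Python sketch inspects the rows returned by the `_cat/shards?format=json` endpoint and flags some of the situations listed above: nodes holding far more shards than average, unassigned shards, and empty shards. The helper names, the `http://localhost:9200` address, and the 1.5 threshold factor are assumptions chosen for this example, not recommended values.

```python
import json
from collections import Counter
from urllib.request import urlopen


def fetch_cat_shards(base_url="http://localhost:9200"):
    """Fetch shard rows as JSON. The base URL is an assumed local node."""
    with urlopen(base_url + "/_cat/shards?format=json") as resp:
        return json.loads(resp.read().decode("utf-8"))


def shards_per_node(rows):
    """Count started shards per node name."""
    counts = Counter()
    for row in rows:
        if row.get("state") == "STARTED" and row.get("node"):
            counts[row["node"]] += 1
    return counts


def overloaded_nodes(counts, factor=1.5):
    """Nodes holding more than `factor` times the mean shard count.

    The factor is an arbitrary illustrative threshold.
    """
    if not counts:
        return []
    mean = sum(counts.values()) / len(counts)
    return sorted(node for node, n in counts.items() if n > factor * mean)


def unassigned_shards(rows):
    """(index, shard) pairs that are not allocated to any node."""
    return [(r["index"], r["shard"]) for r in rows
            if r.get("state") == "UNASSIGNED"]


def empty_shards(rows):
    """(index, shard) pairs of started shards holding no documents."""
    return [(r["index"], r["shard"]) for r in rows
            if r.get("state") == "STARTED" and r.get("docs") == "0"]
```

Against a running cluster, one might call `rows = fetch_cat_shards()` and pass `rows` to each helper. Keeping the analysis functions separate from the HTTP call means they can also be exercised on saved output.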