Diagnosing problems
Elasticsearch node failures can manifest in many different ways. Some of the symptoms of node failures are as follows:
A node crashes during heavy data indexing
The Elasticsearch process stops running for an unknown reason
A cluster won't recover from a yellow or red state
Query requests time out
Index requests time out
When a node in your cluster experiences problems such as these, it can be tempting to just restart Elasticsearch or the node itself and move on as though nothing happened. However, unless the underlying issue is addressed, the problem is likely to resurface. If you encounter any of the scenarios just listed, check the health of your cluster as follows:
Check the cluster health with Elasticsearch-head or Kopf
Check the historical health with Marvel
Check for Nagios alerts
Check Elasticsearch log files
Check system log files
Check the system health using command-line tools
These steps will help diagnose the root cause of problems in your cluster...
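The first check in the list can also be scripted against the cluster health REST endpoint (GET /_cluster/health), which reports the same green, yellow, or red status that tools such as Elasticsearch-head display. The following is a minimal sketch, not a definitive implementation: it assumes a node is reachable at http://localhost:9200, and the function names fetch_cluster_health and summarize_health are hypothetical helpers introduced only for illustration.

```python
import json
from urllib.request import urlopen


def fetch_cluster_health(base_url="http://localhost:9200", timeout=5):
    """Return the parsed JSON body of GET /_cluster/health.

    The default base_url is an assumption; point it at any node in the cluster.
    """
    with urlopen(f"{base_url}/_cluster/health", timeout=timeout) as resp:
        return json.load(resp)


def summarize_health(health):
    """Turn a cluster health response into a list of readable findings.

    An empty list means the response reported nothing worth investigating.
    """
    findings = []

    # "status" is green, yellow, or red; anything but green needs a closer look.
    status = health.get("status", "unknown")
    if status != "green":
        findings.append(f"cluster status is {status}")

    # Unassigned shards are the usual reason for a yellow or red state.
    unassigned = health.get("unassigned_shards", 0)
    if unassigned:
        findings.append(f"{unassigned} unassigned shard(s)")

    # "timed_out" is true when the health request itself did not complete in time.
    if health.get("timed_out"):
        findings.append("cluster health request timed out")

    return findings


# Example usage against a live cluster (requires a reachable node):
#   for finding in summarize_health(fetch_cluster_health()):
#       print(finding)
```

Keeping the interpretation in summarize_health separate from the HTTP call means the same checks can be applied to a saved response, which is convenient when the cluster is too unhealthy to answer reliably.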