Monitoring is difficult, especially for a team of hundreds of engineers, where metrics overload can easily occur. Beyond basic time series-based anomaly detection, several projects can help with this problem. One of them is the Kale stack, which consists of two parts: Skyline and Oculus. Skyline is the anomaly detection system that flags anomalous metrics, while Oculus is the anomaly correlation component. You can download the two components from the following repositories:
- Skyline: http://github.com/etsy/skyline
- Oculus: http://github.com/etsy/oculus
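To make the split between detection and correlation more concrete, here is a minimal Python sketch of the two ideas. It is not code from Skyline or Oculus: the three-sigma rule, the Pearson correlation, the function names, and the sample metric names and values are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def is_anomalous(series, threshold=3.0):
    """Detection step (illustrative): flag the last point if it lies more
    than `threshold` standard deviations from the mean of earlier points."""
    history, latest = series[:-1], series[-1]
    if len(history) < 2:
        return False  # not enough data to estimate the spread
    sigma = stdev(history)
    if sigma == 0:
        return latest != history[0]
    return abs(latest - mean(history)) > threshold * sigma


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no correlation signal
    return cov / (var_x * var_y) ** 0.5


def correlated_metrics(target, others, min_corr=0.9):
    """Correlation step (illustrative): names of the metrics in `others`
    whose series correlate strongly with `target`, strongest first."""
    scored = [(name, pearson(target, series)) for name, series in others.items()]
    matches = [(name, r) for name, r in scored if abs(r) >= min_corr]
    return [name for name, _ in sorted(matches, key=lambda item: -abs(item[1]))]


# Hypothetical sample data: two metrics spike together, one drifts slowly.
metrics = {
    "web.requests": [100, 102, 98, 101, 99, 100, 180],
    "web.latency_ms": [20, 21, 19, 20, 20, 20, 55],
    "disk.free_gb": [500, 500, 499, 499, 498, 498, 497],
}

anomalous = [name for name, series in metrics.items() if is_anomalous(series)]
others = {k: v for k, v in metrics.items() if k != "web.requests"}
related = correlated_metrics(metrics["web.requests"], others)

print("anomalous:", anomalous)
print("related to web.requests:", related)
```

With this sample data, the two spiking metrics are flagged and the slowly drifting disk metric is not, and the latency series is reported as related to the request series. A production system would typically combine several detection algorithms and a more robust similarity measure than a single correlation coefficient.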
You will need the following:
- At least 8 GB RAM
- Quad Core Xeon 5620 CPU, or comparable
- 1 GB disk space
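The requirements above can be checked roughly on a candidate host with a short script. This is a sketch under stated assumptions: the thresholds simply mirror the list, `check_requirements` and `total_ram_gb` are hypothetical helper names, the RAM lookup relies on `os.sysconf` and so works only on Unix-like systems, and a core count says nothing about whether the CPU is actually comparable in speed.

```python
import os
import shutil

# Thresholds taken from the requirements list above.
MIN_RAM_GB = 8
MIN_CORES = 4
MIN_DISK_GB = 1


def total_ram_gb():
    """Total physical memory in GiB, or None if it cannot be determined."""
    try:
        pages = os.sysconf("SC_PHYS_PAGES")
        page_size = os.sysconf("SC_PAGE_SIZE")
    except (AttributeError, ValueError, OSError):
        return None  # os.sysconf is unavailable or the names are unsupported
    return pages * page_size / 1024 ** 3


def check_requirements(path="."):
    """Return a dict of pass/fail flags for RAM, CPU cores, and free disk.

    Note: the operating system usually reports slightly less memory than
    the nominal size, so a nominal 8 GB machine may fail the RAM check.
    """
    ram = total_ram_gb()
    cores = os.cpu_count() or 0
    free_disk_gb = shutil.disk_usage(path).free / 1024 ** 3
    return {
        "ram": ram is not None and ram >= MIN_RAM_GB,
        "cpu": cores >= MIN_CORES,
        "disk": free_disk_gb >= MIN_DISK_GB,
    }


if __name__ == "__main__":
    for name, ok in check_requirements().items():
        print(f"{name}: {'ok' if ok else 'below requirement or unknown'}")
```

Treat a failed check as a prompt to look at the host more closely, not as a definitive verdict.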