Chapter 7: Monitors and Alerts
In the last chapter, we learned how infrastructure is monitored using Datadog. Modern cloud-based infrastructure is far more complex and more heavily virtualized than data center-based, bare-metal compute, storage, and network infrastructure. Datadog is designed to work with cloud-centric infrastructure, and it meets most infrastructure monitoring needs out of the box, whether the infrastructure is bare-metal or hosted in a public cloud.
A core requirement of any monitoring application is to notify you about an ongoing issue, ideally before that issue results in a service outage. In previous chapters, we discussed metrics and how they are generated, viewed, and charted on dashboards. An important use of metrics is to predict an upcoming issue. For example, by tracking the system.disk.free metric on a storage device, it is easy to send a notification when free space falls to a certain threshold. By adding the system.disk.total metric to that equation, it's also possible to track...
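The arithmetic behind such a check can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it treats the two metrics as byte counts and combines them as a percentage of free space, which is one plausible reading of the combination. The function names, the 10 percent threshold, and the sample values are hypothetical and are not part of Datadog.

```python
def free_space_percent(free_bytes: float, total_bytes: float) -> float:
    """Return free space as a percentage of total capacity."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return 100.0 * free_bytes / total_bytes


def should_alert(free_bytes: float, total_bytes: float,
                 min_free_percent: float = 10.0) -> bool:
    """Return True when free space drops below the chosen threshold.

    min_free_percent is a hypothetical threshold picked for illustration.
    """
    return free_space_percent(free_bytes, total_bytes) < min_free_percent


# Hypothetical sample values standing in for system.disk.free
# and system.disk.total readings.
GIB = 1024 ** 3
print(should_alert(free_bytes=40 * GIB, total_bytes=500 * GIB))
```

A relative threshold like this applies to disks of any size, whereas a fixed byte count on free space alone has to be tuned for each device.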