You're reading from Datadog Cloud Monitoring Quick Start Guide: Proactively create dashboards, write scripts, manage alerts, and monitor containers using Datadog

Product type Paperback

Published in Jun 2021

Publisher Packt

ISBN-13 9781800568730

Length 318 pages

Edition 1st Edition

Tools

Datadog

Concepts

Application Monitoring

Author (1):

Thomas Kurian Theakanath



Proactive monitoring

Strictly speaking, monitoring is not part of a software system running in production: the applications in the system can run without any monitoring tools rolled out. As a best practice, software applications must be decoupled from monitoring tools anyway.

Because of this, software systems are sometimes taken to production with minimal or no monitoring. Issues then go unnoticed or, in the worst case, are discovered by users while they are using the service. Such situations are bad for the business for the following reasons:

An issue in production disrupts the business continuity of customers, and there is usually a financial cost associated with it.
Unscheduled downtime of a software service leaves users with a negative impression of the service and its provider.
Unplanned downtime usually creates chaos at the business level. Triaging and resolving such issues is stressful for everyone involved and expensive for the businesses affected.

One of the mitigating steps taken in response to a production issue is to add monitoring so that the same issue is caught and reported to the operations team the next time it occurs. Such a reactive approach grows monitoring coverage organically, but without following a monitoring strategy. While it will catch some issues, there is no guarantee that an organically grown monitoring infrastructure can check the health of the software system and warn about problems early enough for remediation steps to be taken proactively and future outages minimized.

Proactive monitoring refers to rolling out monitoring solutions that report on issues with the components of a software system and with the infrastructure the system runs on. Such reporting can help avert an impending issue through mitigating steps taken manually or automatically. The latter is usually called self-healing: a highly desirable end state of monitoring, but one that is hard to implement.

The key aspects of a proactive monitoring strategy are as follows.

Implementing a comprehensive monitoring solution

Traditionally, the focus of monitoring has been the infrastructure components: compute, storage, and network. As you will see later in this chapter, there are more types of monitoring that complete the list. All relevant types of monitoring have to be implemented for a software system so that issues with any component, whether software or infrastructure, are caught and reported.

Setting up alerts to warn of impending issues

The monitoring solution must be designed to warn of impending issues with the software system. This is straightforward for infrastructure components, because metrics such as memory usage, CPU utilization, and disk space are easy to track and to alert on when usage exceeds set limits.
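As a rough illustration of alerting on usage over limits, the following minimal Python sketch classifies a metric sample against warning and critical thresholds. The metric names and limit values are hypothetical, chosen only for the example; this is not Datadog's monitor configuration, which is covered later in the book.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Threshold:
    """Usage limits, in percent, at which an alert should fire."""
    warning: float
    critical: float


# Hypothetical limits for illustration only; real values depend on the system.
THRESHOLDS: Dict[str, Threshold] = {
    "cpu_percent": Threshold(warning=75.0, critical=90.0),
    "memory_percent": Threshold(warning=80.0, critical=95.0),
    "disk_percent": Threshold(warning=70.0, critical=85.0),
}


def evaluate(metric: str, value: float) -> str:
    """Return 'ok', 'warning', or 'critical' for a single metric sample."""
    limit = THRESHOLDS[metric]
    if value >= limit.critical:
        return "critical"
    if value >= limit.warning:
        return "warning"
    return "ok"


def check_all(sample: Dict[str, float]) -> Dict[str, str]:
    """Evaluate every metric in a sample and return its alert state."""
    return {name: evaluate(name, value) for name, value in sample.items()}
```

The warning level is what makes the check proactive: it fires while there is still headroom, before the critical limit is reached.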

However, this requirement is trickier at the application level: applications can sometimes fail on perfectly configured infrastructure. To mitigate that, software applications should provide insights into what is going on under the hood. In monitoring jargon this is called observability, and we will see later in the book how it can be implemented in Datadog.
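One way an application can expose what is going on under the hood is to count calls, errors, and time spent in its key operations. The sketch below uses only the Python standard library and an in-process dictionary; the `instrumented` decorator, the `METRICS` registry, and the `checkout` function are hypothetical names for this example, and a real deployment would forward these values to a monitoring backend rather than keep them in memory.

```python
import time
from collections import defaultdict
from functools import wraps

# In-process metric registry (an assumption made for this sketch).
METRICS = defaultdict(float)


def instrumented(name):
    """Record call count, error count, and total duration for a function."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            METRICS[f"{name}.calls"] += 1
            try:
                return func(*args, **kwargs)
            except Exception:
                # Count the failure, then let the caller see the exception.
                METRICS[f"{name}.errors"] += 1
                raise
            finally:
                METRICS[f"{name}.seconds_total"] += time.perf_counter() - start
        return wrapper
    return decorator


@instrumented("checkout")
def checkout(cart_total):
    """Hypothetical business operation used to demonstrate the decorator."""
    if cart_total <= 0:
        raise ValueError("cart is empty")
    return f"charged {cart_total:.2f}"
```

With counters like these, a rising error count can be alerted on even when CPU, memory, and disk all look healthy.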

Having a feedback loop

A mature monitoring system that only warns of impending issues so that mitigation steps can be taken is not good enough. Such warnings must also be used to resolve issues automatically (for example, spinning up a new virtual machine with enough disk space when an existing virtual host runs out of disk space), or be fed back into the redesign of the application or infrastructure to prevent the issue from happening in the future.
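The two halves of this feedback loop can be sketched as a small dispatcher: alerts with a known automatic fix trigger a remediation handler, and everything else is queued as input for redesign. All names here (`provision_larger_disk`, `REMEDIATIONS`, `DESIGN_BACKLOG`, the alert types) are hypothetical, and the handler is a placeholder; a real one would call a cloud or provisioning API.

```python
from typing import Callable, Dict, List


def provision_larger_disk(host: str) -> str:
    """Placeholder remediation; a real handler would provision a new VM."""
    return f"requested replacement for {host} with more disk"


# Alert types that have an automatic fix (illustrative mapping).
REMEDIATIONS: Dict[str, Callable[[str], str]] = {
    "disk_full": provision_larger_disk,
}

# Alerts without an automatic fix are kept as input for redesign work.
DESIGN_BACKLOG: List[str] = []


def handle_alert(alert_type: str, host: str) -> str:
    """Remediate automatically when possible; otherwise queue for review."""
    action = REMEDIATIONS.get(alert_type)
    if action is not None:
        return action(host)
    DESIGN_BACKLOG.append(f"{alert_type} on {host}")
    return "queued for design review"
```

Keeping the unhandled alerts in a backlog means recurring issues stay visible and can later be given an automated handler or designed away.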
