Understanding monitoring
Monitoring is defined by Google SRE as the action of collecting, processing, aggregating, and displaying real-time quantitative data about a system, such as query counts and types, error counts and types, processing times, and server lifetimes.
In simple terms, the essence of monitoring is to verify whether a service or an application is behaving as expected. Customers expect a service to be reliable, so delivering the service is only the first step; keeping it reliable is the real goal. To achieve this goal, it is important to explore key data, also known as metrics. Example metrics include uptime, resource usage, network utilization, and application performance.
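To make the idea of a metric concrete, the following minimal Python sketch aggregates a handful of raw request records into summary values of the kind named in the definition above (query counts, error counts, processing times). The `Request` record, the `summarize` function, and the choice to treat status codes of 500 and above as errors are all assumptions made for illustration; they do not come from any particular monitoring product.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Request:
    """One observed request (hypothetical record shape)."""
    status: int          # HTTP-style status code
    duration_ms: float   # processing time in milliseconds


def summarize(requests: list[Request]) -> dict[str, float]:
    """Aggregate raw request records into a few summary metrics."""
    if not requests:
        # Nothing observed yet: report zeros instead of dividing by zero.
        return {
            "query_count": 0,
            "error_count": 0,
            "error_rate": 0.0,
            "mean_duration_ms": 0.0,
        }
    # Assumption: a status code of 500 or above counts as an error.
    errors = sum(1 for r in requests if r.status >= 500)
    return {
        "query_count": len(requests),
        "error_count": errors,
        "error_rate": errors / len(requests),
        "mean_duration_ms": mean(r.duration_ms for r in requests),
    }


if __name__ == "__main__":
    sample = [
        Request(200, 12.0),
        Request(200, 18.0),
        Request(500, 30.0),
        Request(200, 20.0),
    ]
    for name, value in summarize(sample).items():
        print(f"{name}: {value}")
```

In a real system the records would be collected continuously and the aggregated values displayed on a dashboard or fed to alerting; the sketch covers only the aggregation step.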
Monitoring is the means of exploring metric data and providing a holistic view of a system's health, which is a reflection of its reliability. Apart from metric data, monitoring can include data from text-based logging...