Points to remember
The following are some important points to remember:
- Black box monitoring is based on testing externally visible behavior.
- White box monitoring is based on the metrics collected and exposed by the internals of the system.
- Metrics must be specific, measurable, achievable, relevant, and time-bound.
- The four golden signals recommended for a user-facing system are latency, traffic, errors, and saturation.
- Latency can point to emerging issues, and traffic is historically used for capacity planning.
- Errors represent the rate of requests that fail explicitly or implicitly or by policy.
- Saturation represents how full the service is and reflects degrading performance.
- Precision is defined as the proportion of events detected that are significant.
- Recall is defined as the proportion of significant events detected.
- Detection time is defined as the time taken by a system to notice an alert condition.
- Reset time is defined as the...