Best methods used in monitoring
As noted in the introduction to this chapter, monitoring is usually an afterthought in application development and deployment, even though it is a core process and a central part of the Site Reliability Engineer (SRE) role. A primary purpose of this role is to ensure that systems remain highly available and reliable, and one of the pillars of that goal is proper monitoring and observability of the system. The SRE role goes beyond configuring monitoring tools: SREs automate much of the work they do, so some programming or scripting knowledge is needed to be a good SRE.
SREs are also involved in designing and building the process for handling incidents, escalations, and downtime in the system. They work with the business and other departments to set service-level agreements (SLAs), service-level objectives (SLOs), and service-level indicators...