Summary
Monitoring is a large topic, and much more could be said about it. In this chapter, we treated monitoring as a natural human attitude: because anything that is built can fail, there is an instinct to ensure that things are monitored, understood, and improved over time. We applied this concept to computing, explained that computing brings automation into this natural human process, and discussed the different components of monitoring computer systems. We covered logs, metrics, dashboards, and incidents and explained the meaning of each. Next, we explained the importance of monitoring, pinpointing key reasons to monitor your application workload and infrastructure. Then, we introduced Amazon CloudWatch, the AWS managed end-to-end monitoring service, which provides the features that a monitoring infrastructure or service typically requires. Lastly, the icing on the cake was the AWS Well-Architected Framework, which serves as a reference for cloud-native best practices, monitoring included.
This has given us a solid foundation in the fundamentals and components of monitoring and in the importance of monitoring to the day-to-day work of an SRE. We have also seen that CloudWatch is a managed service that takes away the operational overhead of running our own monitoring infrastructure. This foundational knowledge will be beneficial as we go deeper into this book.
In the next chapter, we will take our first step into Amazon CloudWatch to understand its components, events, and alarms.