Chapter 10: Managing Operations in Hybrid Cloud Infrastructure
Building a cloud platform and moving all workloads onto it is not enough; the platform needs continuous management. Cloud providers offer a wide range of managed services that they operate on the customer's behalf. However, organizations are still responsible for a range of tasks that keep the cloud platform running smoothly.
In this chapter, we will discuss the major areas of cloud operations. We will cover the following topics:
- Understanding the pillars of Site Reliability Engineering (SRE)
- Exploring the platform engineering service
- Understanding Key Performance Indicators (KPIs) for the platform service
- Understanding FinOps
- Understanding reference architectures for MLOps
- Understanding the IBM reference architecture for incident management
- Understanding the IBM reference architecture for problem management
- Measuring Operational Readiness Reviews...