Summary
In this chapter, you learned about principles for making your system reliable. These include making the system self-healing by applying automation rules, and reducing the impact of a failure by designing a distributed system in which the workload spans multiple resources.
Overall system reliability depends heavily on your system’s availability and its ability to recover from disaster events. You learned about synchronous and asynchronous data replication and how each affects system reliability. You learned about data replication methods, including array-based, network-based, host-based, and hypervisor-based replication. Each method has its pros and cons, and products from multiple vendors are available to achieve the desired data replication.
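As a reminder of why the replication type matters, the following is a minimal sketch, not tied to any vendor product. It assumes a toy in-memory primary/standby pair; the names `Replica`, `ReplicatedStore`, `flush`, and `simulate` are hypothetical and exist only for illustration. It shows that a synchronous write reaches the standby before it is acknowledged, while an asynchronous write can be lost if the primary fails before the next replication cycle.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical in-memory stand-in for a storage site."""
    records: list = field(default_factory=list)


class ReplicatedStore:
    """Toy primary/standby pair; illustrative only, not a real product API."""

    def __init__(self, synchronous: bool):
        self.synchronous = synchronous
        self.primary = Replica()
        self.standby = Replica()
        self.pending = []  # acknowledged writes not yet shipped (async only)

    def write(self, record: str) -> None:
        self.primary.records.append(record)
        if self.synchronous:
            # Synchronous: the standby is updated before the write is acknowledged.
            self.standby.records.append(record)
        else:
            # Asynchronous: acknowledge immediately, ship to the standby later.
            self.pending.append(record)

    def flush(self) -> None:
        """Ship queued writes to the standby (models asynchronous replication lag)."""
        self.standby.records.extend(self.pending)
        self.pending.clear()

    def lost_on_primary_failure(self) -> list:
        """Records that would be missing from the standby if the primary failed now."""
        return self.primary.records[len(self.standby.records):]


def simulate(synchronous: bool) -> list:
    store = ReplicatedStore(synchronous)
    store.write("order-1")
    store.write("order-2")
    store.flush()            # has nothing to ship in synchronous mode
    store.write("order-3")   # assume the primary fails before the next flush
    return store.lost_on_primary_failure()


if __name__ == "__main__":
    print("sync lost: ", simulate(True))
    print("async lost:", simulate(False))
```

In this sketch the synchronous store loses nothing, while the asynchronous store loses the write made after the last flush. The trade-off the sketch leaves out is latency: a real synchronous write must wait for the remote site to confirm, which is why asynchronous replication is often preferred over long distances.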
You learned about disaster planning methods, chosen according to the organization’s needs and its RTO and RPO. You learned about the backup and restore method, which...