Failures of many kinds can happen at any time. They can result from disk failures, power outages, natural disasters, software errors, or human errors. In addition, any given cloud application has several points of failure, including DNS or domain services, load balancers, web and application servers, database servers, application services, and data centers. You need a mitigation strategy for each of these types and points of failure.
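One way to keep track of this is a simple coverage check that flags any point of failure without a documented mitigation. The following is a minimal sketch, not a prescribed method: the identifiers in `FAILURE_POINTS` mirror the points of failure listed above, while the helper `unmitigated` and the mitigation entries in `plan` are hypothetical placeholders chosen for illustration.

```python
# Points of failure named in the text, written as hypothetical identifiers.
FAILURE_POINTS = [
    "dns",
    "load_balancer",
    "web_app_server",
    "database_server",
    "application_service",
    "data_center",
]


def unmitigated(failure_points, mitigations):
    """Return the failure points that have no (non-empty) mitigation entry."""
    return [point for point in failure_points if not mitigations.get(point)]


# Illustrative plan only; the mitigations are assumptions, not recommendations
# from the text. "data_center" is left out on purpose to show a detected gap.
plan = {
    "dns": "secondary DNS provider",
    "load_balancer": "redundant load balancers",
    "web_app_server": "multiple instances behind the load balancer",
    "database_server": "replica with automated promotion",
    "application_service": "retries with a fallback path",
}

gaps = unmitigated(FAILURE_POINTS, plan)
```

Here `gaps` contains only the data center entry, which is the one point the sample plan does not cover. A check like this could run as part of a review or build step so that a newly added component cannot ship without a recovery plan.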
It is highly recommended that you automate your recovery strategy and thoroughly test as many of these processes as possible.
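As an illustration of automated recovery that can also be tested, the sketch below retries a failing endpoint with exponential backoff and then fails over to the next one. It is a simplified example under stated assumptions: `call_with_failover`, the endpoint names, and `simulated_request` are all hypothetical, and the simulated outage stands in for a real failure so the recovery path can be exercised without waiting for one.

```python
import time


def call_with_failover(endpoints, request, retries=2, base_delay=0.5, sleep=time.sleep):
    """Try each endpoint in order, retrying with exponential backoff before failing over."""
    last_error = None
    for endpoint in endpoints:
        for attempt in range(retries + 1):
            try:
                return request(endpoint)
            except ConnectionError as exc:
                last_error = exc
                if attempt < retries:
                    # Delay doubles on each retry: base_delay, 2x, 4x, ...
                    sleep(base_delay * (2 ** attempt))
    raise RuntimeError("all endpoints failed") from last_error


# Simulated outage (hypothetical): the primary endpoint is down, the standby is healthy.
calls = []


def simulated_request(endpoint):
    calls.append(endpoint)
    if endpoint == "primary":
        raise ConnectionError("primary is unreachable")
    return f"ok from {endpoint}"


# The sleep function is replaced with a no-op so the drill runs instantly.
result = call_with_failover(
    ["primary", "standby"], simulated_request, retries=2, sleep=lambda _seconds: None
)
```

Injecting the `sleep` function keeps the recovery logic fast to test, which makes it practical to run such failure drills regularly rather than only after an incident.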
Currently, the AWS Cloud operates 44 Availability Zones (AZs) within 16 geographic Regions around the world, with announced plans for 17 more Availability Zones and six more Regions. The high-speed...