Implementing reliability
Reliability can be defined as the ability of a system or component to function under stated conditions for a specified period of time. It focuses on preventing failures throughout the lifetime of the product or system, from commissioning to decommissioning. There are different patterns for achieving reliability, as detailed next.
Redundant resources
This is an expensive pattern for reliability because the infrastructure is duplicated. In this pattern, the target platform is replicated in multiple geographic locations, and each component within the target platform is itself replicated multiple times, as illustrated in the following diagram:
Figure 9.5 – Redundancy of resources in multiple locations
Figure 9.5 shows a reliable system in which the redundant-resources pattern is applied. This kind of system is robust and can survive individual failures: if any one replica of the target platform is unavailable, another replica can be used in its place.
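From a client's point of view, the failover behavior described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular platform implements it: each replica is modeled as a plain callable, and the names `call_with_failover` and `AllReplicasFailed` are hypothetical helpers introduced for this example.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


class AllReplicasFailed(Exception):
    """Raised when every replica has failed (hypothetical helper)."""


def call_with_failover(replicas: Sequence[Callable[[], T]]) -> T:
    """Try each replica in order and return the first successful result.

    Assumption: a replica signals unavailability by raising an exception.
    """
    errors = []
    for replica in replicas:
        try:
            return replica()
        except Exception as exc:  # broad on purpose: any failure triggers failover
            errors.append(exc)
    # No replica could serve the request; report every failure seen.
    raise AllReplicasFailed(errors)


# Usage: two stand-in replicas, one unavailable and one healthy.
def primary_region() -> str:
    raise ConnectionError("primary region unavailable")


def secondary_region() -> str:
    return "response from secondary region"


result = call_with_failover([primary_region, secondary_region])
```

Here the request succeeds even though the primary replica is down. In a real deployment, the switch between replicas is more often handled by a load balancer or DNS-based routing than by client code, so treat this as a model of the idea rather than a recommended design.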
Retrying until N times
In...