Architecting for resilience and business continuity
Keeping your applications running matters for different reasons. Depending on the nature of your solution, the impact of downtime can range from lost productivity to direct financial loss. Building systems that can withstand failure has always been a critical aspect of architecture, and the cloud gives us more options for doing so.
Building resilient solutions comes at a cost; therefore, you need to balance the cost of an outage against the cost of preventing it.
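As a rough illustration of that balance, the following sketch compares the expected yearly cost of downtime before and after a resilience investment. The function names, availability figures, and monetary amounts are hypothetical and chosen only for the example; a real assessment would use your own measured availability and business impact figures.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours; leap years are ignored for simplicity


def expected_annual_outage_cost(availability: float, cost_per_hour: float) -> float:
    """Estimate the yearly cost of downtime.

    availability  -- fraction of the year the system is up (0.0 to 1.0)
    cost_per_hour -- assumed business cost of one hour of downtime
    """
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    downtime_hours = (1.0 - availability) * HOURS_PER_YEAR
    return downtime_hours * cost_per_hour


def resilience_is_worthwhile(
    current_availability: float,
    improved_availability: float,
    cost_per_hour: float,
    annual_resilience_cost: float,
) -> bool:
    """Return True when the avoided outage cost exceeds the cost of prevention."""
    avoided_cost = expected_annual_outage_cost(
        current_availability, cost_per_hour
    ) - expected_annual_outage_cost(improved_availability, cost_per_hour)
    return avoided_cost > annual_resilience_cost


if __name__ == "__main__":
    # Hypothetical figures: moving from 99% to 99.9% availability
    # at an assumed extra cost of 50,000 per year.
    for hourly_cost in (1_000, 100):
        decision = resilience_is_worthwhile(0.99, 0.999, hourly_cost, 50_000)
        print(f"downtime cost per hour {hourly_cost}: invest = {decision}")
```

With these assumed numbers, the same investment pays for itself when an hour of downtime costs 1,000 but not when it costs 100, which is why the answer depends on the nature of the solution rather than on the technology alone.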
High Availability (HA) is the traditional option and essentially involves doubling up on components so that if one fails, the other automatically takes over. A database server is a typical example: building a cluster of two or more nodes with data replication between them protects against the failure of any one server, because traffic is redirected to the secondary replica if the primary fails, as per the example in the following diagram...