The cost of more reliability as a business decision
I wish every outage came with a six-figure remediation budget to shore up our code, systems, and architecture. Trust me, you’d be hard-pressed to see that even once in your career as a rockstar SRE.
The truth is that reliability costs money, and how much of it to spend usually comes down to a business decision. With that in mind, we’ll discuss not only the technical side of these reliability options but also, more importantly, what each one costs in both outage time and budget.
Active:Active
A basic and highly effective strategy is to create two copies of an architecture and put them in different data centers or regions, with some type of load balancing that distributes load between the two. The downside is that for an outage to have zero impact, your applications still need retries. You also need enough capacity online, often called spinning capacity, to accept the entire load in both data centers and...
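As a rough illustration of the two costs named here, application retries and spare capacity, the following is a minimal Python sketch rather than a production pattern. The names `SiteUnavailable`, `call_with_failover`, `max_safe_utilization`, `fake_send`, and the site labels `dc-east` and `dc-west` are all hypothetical and invented for this example. It assumes the client knows both sites and that a failed request is safe to repeat.

```python
from typing import Callable, Optional, Sequence


class SiteUnavailable(Exception):
    """Raised by the (hypothetical) transport when a site cannot serve a request."""


def call_with_failover(
    send: Callable[[str], str],
    sites: Sequence[str],
    attempts_per_site: int = 2,
) -> str:
    """Try each site in order, retrying a few times before moving to the next."""
    last_error: Optional[Exception] = None
    for site in sites:
        for _ in range(attempts_per_site):
            try:
                return send(site)
            except SiteUnavailable as exc:
                last_error = exc  # remember the failure and keep trying
    raise SiteUnavailable(f"all sites failed: {last_error}")


def max_safe_utilization(site_count: int) -> float:
    """Highest steady-state utilization per site that still lets the
    surviving sites absorb the full load after losing one site."""
    if site_count < 2:
        raise ValueError("need at least two sites")
    return (site_count - 1) / site_count


def fake_send(site: str) -> str:
    """Stand-in transport for the demo: pretend dc-east is having an outage."""
    if site == "dc-east":
        raise SiteUnavailable("dc-east is down")
    return f"served by {site}"


if __name__ == "__main__":
    print(call_with_failover(fake_send, ["dc-east", "dc-west"]))
    print(max_safe_utilization(2))
```

With two sites, `max_safe_utilization(2)` returns 0.5: each data center can run at no more than half of its capacity in normal operation if either one must be able to take the whole load. That idle half is the budget side of the trade-off, and the retry loop is the application-side work that the load balancer alone does not cover.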