Building Resilient Microservices
Coming on the heels of the Saga pattern, we can appreciate the value of building fail-safes into our microservices application. We need to ensure that we adequately handle inevitable failures.
We can’t assume that our distributed microservices will always be up and running. We also can’t assume that our supporting infrastructure will be reliable. These considerations mean that we must anticipate failures, whether prolonged or transient.
A prolonged outage can be due to a downed server or service, or some other critical part of the infrastructure. Such outages tend to be easier to detect and mitigate since they have a more obvious impact on the running application. Transient failures are far more difficult to detect since they may last only a few seconds to a few minutes at a time and aren’t usually tied to any obvious issue in the infrastructure. Something as simple as a service taking 5...