Designing for failure
Assume that things will fail: carefully review every aspect of your cloud architecture and design for the failure scenarios of each one. In particular, assume that hardware will fail, cloud data center outages will happen, databases will fail or degrade in performance, expected transaction volumes will be exceeded, and so on. In an auto-scaled environment, for example, nodes may be shut down when load returns to normal levels after a spike. Nodes might also be rebooted by the cloud platform, and applications can fail unexpectedly. In all cases, the design goal should be to handle such error conditions gracefully and minimize any impact on the user experience.
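One common way to handle such error conditions gracefully in application code is to retry a failing call with exponential backoff and fall back to a degraded response when the retries are exhausted. The following Python sketch illustrates the idea only; the names TransientError and call_with_retry, the default attempt count, and the delays are assumptions made for this example and are not taken from any particular cloud platform or library.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical error type standing in for a recoverable failure,
    such as a node reboot or a temporarily unavailable database."""


def call_with_retry(operation, fallback, max_attempts=4, base_delay=0.1,
                    sleep=time.sleep):
    """Call operation(), retrying on TransientError with exponential
    backoff plus random jitter. If every attempt fails, return
    fallback() so the caller still gets a degraded but usable result."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts - 1:
                break  # out of attempts; fall through to the fallback
            delay = base_delay * (2 ** attempt)
            # Jitter spreads out retries from many clients so they do
            # not all hit a recovering service at the same moment.
            sleep(delay + random.uniform(0, delay))
    return fallback()
```

A caller might wrap a database read as call_with_retry(read_from_database, read_from_cache), where both functions are hypothetical, so that users see slightly stale data rather than an error page while the database recovers.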
There should be a strong preference for minimizing human or manual intervention. Hence, wherever possible, implement strategies that use services made available by the cloud platform to reduce the chance of failures or to automate recovery from them. For...