Minimizing disruptions
Deployments often include tasks that are disruptive or destructive, such as restarting services or performing database migrations. Disruptive tasks should be clustered together to minimize the overall impact on an application, while destructive tasks should be performed only once.
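The ordering idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a deployment tool: the names Kind, Task, and plan_deployment are hypothetical, the host names are made up, and the choice to run destructive tasks on the first host before the clustered disruptive tasks is one plausible ordering rather than a rule from the text.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    ROUTINE = "routine"          # safe to run on every host at any time
    DISRUPTIVE = "disruptive"    # e.g. a service restart
    DESTRUCTIVE = "destructive"  # e.g. a database migration


@dataclass(frozen=True)
class Task:
    name: str
    kind: Kind


def plan_deployment(hosts, tasks):
    """Return an ordered list of (host, task name) steps.

    Hypothetical planner: routine tasks run on every host first,
    destructive tasks run exactly once (assumed: on the first host),
    and disruptive tasks are clustered together at the very end.
    """
    if not hosts:
        return []

    routine = [t for t in tasks if t.kind is Kind.ROUTINE]
    destructive = [t for t in tasks if t.kind is Kind.DESTRUCTIVE]
    disruptive = [t for t in tasks if t.kind is Kind.DISRUPTIVE]

    steps = []
    # Phase 1: non-disruptive work on each host.
    for host in hosts:
        steps.extend((host, t.name) for t in routine)
    # Phase 2: destructive work, performed only once.
    steps.extend((hosts[0], t.name) for t in destructive)
    # Phase 3: all disruptions grouped into a single window.
    for host in hosts:
        steps.extend((host, t.name) for t in disruptive)
    return steps


if __name__ == "__main__":
    example_tasks = [
        Task("copy code", Kind.ROUTINE),
        Task("restart service", Kind.DISRUPTIVE),
        Task("migrate database", Kind.DESTRUCTIVE),
    ]
    for host, name in plan_deployment(["web1", "web2"], example_tasks):
        print(f"{host}: {name}")
```

Keeping the three phases separate means the application is affected during one short window at the end of the run instead of once per host, and the migration cannot be applied twice by accident.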
Delaying a disruption
Restarting services for a new code version is a very common need. When viewed in isolation, a single service can be restarted whenever the code and configuration for the application have changed, without concern for the health of the overall distributed system. Typically, a distributed system will have roles for each part of the system, and each role will operate essentially in isolation on the hosts targeted to perform that role. When deploying an application for the first time, there is no existing uptime of the whole system to worry about, so services can be restarted at will. However, during an upgrade, it may be desirable...