Docker Swarm already provides almost everything we need from a system that self-heals services.
What follows is a short demonstration of some of the scenarios the system might encounter when service replicas fail. I already warned you that at least a basic knowledge of operating Swarm is a prerequisite for this book, so I chose to skip a lengthy discussion of the features behind the scheduler. I won't go into details; I'll only demonstrate that Swarm ensures the services are (almost) always healthy.
Let's see what happens when one of the three replicas of the go-demo_main service fails. We'll simulate it by stopping the primary process inside one of the replicas.
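To observe the effect of the failure, it can help to count the running replicas before and after the experiment. The following is only a minimal sketch: `count_running` is a hypothetical helper (not part of Docker), and the commented usage assumes the commands are run on a manager node of a cluster where `go-demo_main` is deployed.

```shell
# Hypothetical helper (an assumption, not a Docker command):
# counts the lines on stdin that start with "Running".
# It expects one task state per line, as printed by
#   docker service ps --format '{{.CurrentState}}' <service>
count_running() {
  # grep -c prints the number of matching lines; "|| true" keeps the
  # exit status at zero when there are no matches (the count is still printed).
  grep -c '^Running' || true
}

# Usage on a manager node (requires a running Swarm cluster):
#   docker service ps -f desired-state=Running \
#       --format '{{.CurrentState}}' go-demo_main | count_running
```

If Swarm behaves as described, the count should drop briefly after a replica is stopped and then return to three once the scheduler has replaced the failed task.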
The first thing we need to do is find the node where one of the replicas is running.
NODE=$(docker service ps \
-f desired-state=Running...