The job of a system that self-heals services is to make sure that they are (almost) always running according to the design. Such a system needs to monitor the state of the cluster and continuously ensure that every service is running its specified number of replicas. If a replica stops, the system should start a new one. If a whole node goes down, all the replicas that were running on that node should be rescheduled across the healthy nodes. As long as the cluster has enough capacity to host all the replicas, such a system should be able to maintain the defined specifications.
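The behavior described above can be sketched as a small reconciliation step. This is only an illustration under simplifying assumptions, not the algorithm of any particular orchestrator: the `Node` class, the `reconcile` function, and the "capacity as a replica count" model are all hypothetical names invented for this example. A real system would run such a step continuously and would also handle scaling down, which this sketch ignores.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical model of a cluster node (an assumption for this sketch)."""
    name: str
    capacity: int                      # max number of replicas the node can host
    healthy: bool = True
    replicas: list = field(default_factory=list)  # names of services running here


def reconcile(nodes, desired):
    """Start missing replicas so that running counts match the desired counts.

    `desired` maps a service name to its specified number of replicas.
    Returns the list of (service, node_name) placements made in this pass.
    """
    # Replicas that were running on a failed node are considered lost.
    for node in nodes:
        if not node.healthy:
            node.replicas.clear()

    healthy = [n for n in nodes if n.healthy]
    placements = []
    for service, want in desired.items():
        running = sum(n.replicas.count(service) for n in healthy)
        for _ in range(want - running):
            candidates = [n for n in healthy if len(n.replicas) < n.capacity]
            if not candidates:
                break  # capacity exhausted: the specification cannot be met
            # Spread the load by picking the least-loaded healthy node.
            target = min(candidates, key=lambda n: len(n.replicas))
            target.replicas.append(service)
            placements.append((service, target.name))
    return placements


# Usage: schedule the services, then simulate the loss of a whole node.
nodes = [Node("node-1", 4), Node("node-2", 4), Node("node-3", 4)]
desired = {"web": 3, "db": 1}
reconcile(nodes, desired)          # initial scheduling
nodes[0].healthy = False           # node-1 goes down
rescheduled = reconcile(nodes, desired)
print(rescheduled)                 # replicas moved to the healthy nodes
```

Picking the least-loaded node is just one possible placement strategy. The `break` on exhausted capacity mirrors the caveat in the text: the specification can be maintained only while the healthy nodes can host all the replicas.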
Having a system that self-heals services does not mean that it provides high availability. If a replica stops being operational, the system will bring it back into the running state. However, there will be a (very) short period between a failure and the moment the system...