High availability, in simple terms, means achieving very close to 100 percent system uptime by ensuring that there is no single point of failure. This is typically done by incorporating redundancy mechanisms, such as backup processes; taking our instances from the failed ones; and so on. So, by design, Mesos provides a fault-tolerant environment for running applications. The master daemon and the slave daemon operate in a distributed and highly available manner, ensuring that no one component can cause an outage of the entire cluster.
Let's learn about high availability. To ensure that Mesos is highly available to the application that uses its cluster manager, the Mesos master uses a single leader and multiple standby masters, which are ready to take over in the event that the leader's master fails. The master uses a ZooKeeper to coordinate leadership...