Introduction
To a MongoDB developer, the database server is often a black box living somewhere in the cloud or in a data center. The details do not matter as long as the database is up and running when needed. From a business perspective, things look different. When a production application must be available to customers 24/7, those details matter a great deal. Any outage reduces service availability for customers and, if the failure is not recovered quickly, can ultimately hurt the business's financial results.
Outages happen from time to time, and for a wide variety of reasons. They are often the result of common hardware failures, such as disk or memory failures, but they can also be caused by network failures, software failures, or even application failures. For example, a software failure such as an OS bug can render the server unresponsive to users...