Chapter 5: Fault Tolerance and Auto-Rebalancing
In Chapter 4, Geo-Partitioning, we learned what geo-partitioning is, why it is needed, and how CockroachDB supports it.
In this chapter, we will discuss what fault tolerance and auto-rebalancing are and how CockroachDB provides these features. We will also learn about multi-node failure scenarios and how to recover from them.
Fault tolerance refers to how CockroachDB copes with various types of failures while continuing to serve requests. Auto-rebalancing, in general, is the ability of a cluster to adapt as nodes are added or removed by redistributing data and load across the available nodes, so that hotspots are avoided. We will discuss auto-rebalancing with specific examples that you can try yourself.
The following topics will be covered in this chapter:
- Achieving fault tolerance
- Automatic rebalancing
- Recovering from multi-node failures