Ceph is largely autonomous in maintaining itself and recovering from failures, but some situations require human intervention. This chapter looks at the common errors and failure scenarios where that is the case, and at how to troubleshoot them to bring Ceph back to a working state. You will learn about the following topics:
- How to correctly repair inconsistent objects
- How to solve problems with the help of peering
- How to deal with near_full and too_full OSDs
- How to investigate errors via Ceph logging
- How to investigate poor performance
- How to investigate PGs in a down state