Introduction
In this chapter, we will configure backups, restore processes, logs, and recovery using the Secondary Namenode. Even with high availability in place, it is very important to back up data for adverse situations, regardless of whether a Secondary or backup node is running and constantly syncing from the Primary node.
In a master-slave architecture, if the slave is syncing data from the master and the data on the master gets corrupted, the slave will most likely pull the same corrupted data, leaving us with two bad copies. Although checksums are in place to detect corrupt data, this is production-critical data, so there must always be a business continuity or recovery plan.
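To make this reasoning concrete, here is a minimal sketch in Python. It is an illustration only and not how HDFS implements checksums or replication: the helper names `store`, `is_intact`, and `sync` are hypothetical, and `zlib.crc32` stands in for whatever checksum the storage layer actually uses. It shows that a checksum catches bytes that change after they were written, but cannot catch bad content that was written "correctly" and then synced.

```python
import zlib


def store(data: bytes) -> dict:
    """Store a block together with the checksum computed at write time."""
    return {"data": data, "crc": zlib.crc32(data)}


def is_intact(block: dict) -> bool:
    """Return True when the stored bytes still match the stored checksum."""
    return zlib.crc32(block["data"]) == block["crc"]


def sync(primary: dict) -> dict:
    """Copy the block to a replica, as a constantly syncing node would."""
    return dict(primary)


# Case 1: bytes change on disk after the write (for example, bit rot).
# The stored checksum no longer matches, so the corruption is detected.
primary = store(b"customer ledger v1")
primary["data"] = b"customer ledger vX"
bit_rot_detected = not is_intact(primary)

# Case 2: a faulty job overwrites the block with wrong (here, empty) content.
# The checksum is computed over the bad data, so it is valid, and the
# replica faithfully copies the same bad block.
primary = store(b"")
replica = sync(primary)
bad_write_passes = is_intact(primary) and is_intact(replica)
```

In the second case both copies pass verification even though the original content is gone, which is why a separate backup taken before the bad write is the only way back.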