In this chapter, we went through the activities that Hadoop administrators perform to monitor and optimize a Hadoop cluster. We looked at the roles and responsibilities of an administrator, followed by cluster planning. We then took a deep dive into key management aspects of the Hadoop cluster, such as resource management through job scheduling with schedulers such as the Fair Scheduler and the Capacity Scheduler. We also looked at ensuring high availability and security for the Apache Hadoop cluster. This was followed by the day-to-day activities of Hadoop administrators, covering adding new nodes, archiving, Hadoop metrics, and so on.
In the next chapter, we will look at Hadoop ecosystem components, which help businesses develop big data applications rapidly.