Summary
In this chapter, we looked at the Kerberos authentication protocol and the key concepts involved in implementing it. We then examined the default security implementation in Hadoop and how a Hadoop process obtains the logged-in user's identity and group details. This default implementation has many gaps and is not suitable for production use.
In a production scenario, securing Hadoop with Kerberos is essential, so we looked at the security requirements that Hadoop supports at the user level and at the Hadoop service level. We also looked at the internal tokens (Delegation Token, Block Access Token, and Job Token) that the various Hadoop processes exchange to keep the ecosystem secure. Understanding why these tokens are needed and how they are used is vital for debugging and troubleshooting configuration issues in a secured Hadoop cluster. In the next chapter, we will detail the procedure for securing a Hadoop cluster.