Configuring Hadoop with Kerberos authentication
Once the Kerberos setup is complete and the user principals have been added to the KDC, we can configure Hadoop to use Kerberos authentication. We assume that a Hadoop cluster running in non-secure mode is already configured and available. We will begin the configuration using Cloudera Distribution of Hadoop (CDH4).
The steps involved in configuring Kerberos authentication for Hadoop are shown in the following figure:
Setting up the Kerberos client on all the Hadoop nodes
On each Hadoop node (master and slave), we need to install the Kerberos client. This is done by installing the client packages and libraries on the node.
For RHEL/CentOS/Fedora, we will use the following command:
yum install krb5-libs krb5-workstation
For Ubuntu, we will use the following command:
apt-get install krb5-user
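When the same step has to be repeated on many nodes, it can help to select the install command from the distribution name. The following is only a minimal sketch: the function name krb5_install_cmd is hypothetical, and the distribution IDs it accepts (rhel, centos, fedora, ubuntu) are an assumption about how the node identifies itself. It prints the command rather than running it, so the output can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helper (not part of CDH or Kerberos): print the Kerberos
# client install command for a given distribution ID.
# The package names are the ones used in the commands above.
krb5_install_cmd() {
  case "$1" in
    rhel|centos|fedora)
      echo "yum install krb5-libs krb5-workstation"
      ;;
    ubuntu)
      echo "apt-get install krb5-user"
      ;;
    *)
      # Unknown distribution: report it and return a non-zero status.
      echo "unsupported distribution: $1" >&2
      return 1
      ;;
  esac
}
```

On systems that provide /etc/os-release, the ID field defined there could be passed as the argument, for example krb5_install_cmd "$ID" after sourcing the file. Older releases may not ship that file, in which case the distribution name has to be supplied by hand.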
Setting up Hadoop service principals
In CDH4, there are three users (hdfs, mapred, and yarn) that are used to run the various Hadoop daemons. All the...