Configuring SSL in Hadoop
In this recipe, we will configure SSL for Hadoop services. SSL can be enabled for the web UIs, WebHDFS, YARN, the shuffle phase, RPC, and so on. The key components for enabling SSL are certificates, the keystore, and the truststore. Each of these must be kept secure.
SSL can be one-way or two-way. The preferred method is one-way, in which the clients validate the server's identity. Two-way SSL, in which the server also validates each client's certificate, increases latency and involves configuration overhead.
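To make the difference between one-way and two-way SSL concrete, here is a minimal sketch using Python's standard ssl module. It is only an illustration of the concept and is not how Hadoop itself is configured; the function names make_server_context and make_client_context are hypothetical and chosen for this example.

```python
import ssl


def make_server_context(require_client_cert: bool) -> ssl.SSLContext:
    """Build a server-side TLS context (hypothetical helper for illustration).

    One-way SSL: the server does not ask the client for a certificate.
    Two-way SSL: the server requires and verifies a client certificate.
    """
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.verify_mode = ssl.CERT_REQUIRED if require_client_cert else ssl.CERT_NONE
    return ctx


def make_client_context() -> ssl.SSLContext:
    """Build a client-side TLS context (hypothetical helper for illustration).

    In both one-way and two-way SSL, the client verifies the server's
    certificate chain and hostname.
    """
    return ssl.create_default_context()


if __name__ == "__main__":
    one_way = make_server_context(require_client_cert=False)
    two_way = make_server_context(require_client_cert=True)
    client = make_client_context()
    print("one-way server verify_mode:", one_way.verify_mode)
    print("two-way server verify_mode:", two_way.verify_mode)
    print("client verify_mode:", client.verify_mode)
```

In a real deployment, both contexts would also need certificates loaded: the server its own certificate and key, and in the two-way case the client its own certificate as well. That extra material on every client is the configuration overhead mentioned above.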
Getting ready
To complete this recipe, the user must have a running cluster with HDFS and YARN set up. Users can refer to Chapter 1, Hadoop Architecture and Deployment, for installation details.
We assume that the user is familiar with HDFS concepts and its layout, understands how SSL works, and has experience creating SSL certificates. For this recipe, we will be using self-signed certificates, but for production it is recommended to use a proper CA-signed...