Data in a Data Lake is highly critical for the organization and it has to be secured at all times. In addition, to meet various regulatory and security policies standards within an organization, encryption of data is a must along with authentication and authorization. Encryption should be done to:
- Data at rest and
- Data in transit
The following figure shows both the data in rest and in transit and how encryption enables securing the data:
Figure 15: Data Encryption
Before we enable authentication and authorization, it's important to secure the channel through that the credentials would pass through. For this the channel should be secured paving way for data in transit to be transferred in an encrypted fashion. Various technologies in the Hadoop ecosystem communicated with one another using a variety of protocols such as RPC, TCP/IP, HTTP(S) and so on According to the protocol,...