Apache Ranger
Apache Ranger is a security framework for defining policies that control data access in Hadoop. It provides a web-based console that the administrators of a Hadoop cluster can use to define and activate access policies. Apache Ranger understands how different tools interact with Hadoop and lets you define permissions accordingly. For example, for Hive data, you can specify whether a user is allowed to create or drop a table, or to read a particular column.
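To make the idea of table-level and column-level permissions concrete, the following is a minimal sketch of how such a policy check could be modelled. It is not Apache Ranger's API or its actual evaluation logic: the HivePolicy class, the is_allowed function, the deny-by-default rule, and the sample users, database, table, and column names are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HivePolicy:
    """Hypothetical, simplified stand-in for an access policy on Hive data."""
    database: str
    table: str
    columns: frozenset   # column names covered; "*" means every column
    users: frozenset     # users the policy applies to
    accesses: frozenset  # permitted operations, e.g. {"select", "create", "drop"}

    def allows(self, user, access, database, table, column=None):
        if user not in self.users or access not in self.accesses:
            return False
        if (database, table) != (self.database, self.table):
            return False
        # Table-level operations such as create or drop carry no column.
        if column is None:
            return True
        return "*" in self.columns or column in self.columns


def is_allowed(policies, user, access, database, table, column=None):
    """Assumed rule for this sketch: deny unless some policy grants the request."""
    return any(p.allows(user, access, database, table, column) for p in policies)


# Illustrative policies: an analyst may read two columns, an ETL user may do more.
policies = [
    HivePolicy("sales", "orders", frozenset({"order_id", "amount"}),
               frozenset({"analyst"}), frozenset({"select"})),
    HivePolicy("sales", "orders", frozenset({"*"}),
               frozenset({"etl"}), frozenset({"select", "create", "drop"})),
]

if __name__ == "__main__":
    print(is_allowed(policies, "analyst", "select", "sales", "orders", "amount"))
    print(is_allowed(policies, "analyst", "drop", "sales", "orders"))
```

In this sketch the analyst can read the amount column but cannot drop the table, while the etl user can do both. In Apache Ranger itself, policies of this kind are defined through the web-based console rather than in code.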
Apache Ranger also maintains an audit log and analytics data, which risk and compliance personnel can review from a web-based console.
These features make Apache Ranger an important technology to use when building a data lake.
The examples in this chapter use Apache Ranger version 0.5.0, which can be downloaded from the Apache Software Foundation website.
Installing Apache Ranger
Before the installation of Apache Ranger, you need to make sure that you...