Audit logging in Hadoop
Audit logging is an accounting process that records every operation performed in a Hadoop cluster. HDFS and the MapReduce engine already log through the log4j framework; audit logs use the same framework, but they capture more events and give finer-grained visibility into Hadoop operations. Logging is configured in the log4j.properties file.
By default, the log4j.properties file sets the audit log threshold to WARN. Setting this level to INFO turns audit logging on. The following snippet shows the log4j.properties configuration when HDFS and MapReduce audit logs are turned on:
#
# hdfs audit logging
#
hdfs.audit.logger=INFO,NullAppender
hdfs.audit.log.maxfilesize=256MB
hdfs.audit.log.maxbackupindex=20
log4j.logger.org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit=${hdfs.audit.logger}
log4j.additivity.org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit=false
log4j.appender.RFAAUDIT=org.apache.log4j.RollingFileAppender
log4j...
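The HDFS properties above have MapReduce counterparts that follow the same pattern. A sketch of what the MapReduce audit section of a stock log4j.properties typically looks like follows; exact property names can vary between Hadoop versions, so treat this as illustrative:

```
#
# mapred audit logging
#
mapred.audit.logger=INFO,NullAppender
mapred.audit.log.maxfilesize=256MB
mapred.audit.log.maxbackupindex=20
log4j.logger.org.apache.hadoop.mapred.AuditLogger=${mapred.audit.logger}
log4j.additivity.org.apache.hadoop.mapred.AuditLogger=false
```

As with the HDFS section, additivity is disabled so audit records go only to the audit appender rather than being duplicated in the main daemon log.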
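Once enabled, each HDFS operation produces one audit record written by FSNamesystem.audit. An illustrative line (values here are made up for the example) looks roughly like this:

```
2017-01-02 10:43:00,123 INFO FSNamesystem.audit: allowed=true ugi=hdfs (auth:SIMPLE) ip=/192.168.0.10 cmd=listStatus src=/user/hdfs dst=null perm=null
```

The fields record whether the operation was allowed, the requesting user (ugi), the client IP, the command, and the source and destination paths involved, which makes the audit log suitable for accounting and security review.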