Index
A
- Apache Bigtop project
- URL / Setting up NameNode
- Authentication Server (AS) / Kerberos overview
- automatic failover option / Setting up NameNode
B
- bigtop-jsvc package / Setting up NameNode
- bigtop-utils package / Setting up NameNode
C
- -chmod command / HDFS security
- CapacityTaskScheduler
- about / CapacityTaskScheduler
- CDH 4.1 / Setting up NameNode
- CDH HA guide
- URL / JobTracker configuration
- CDH repositories
- setting up / Setting up the CDH repositories
- CDH repository
- used, for installing Sqoop / Installing and configuring Sqoop
- check_ping plugin / NameNode checks
- CLI / Installing the EMR command-line interface
- client, Hive
- installing / Installing the Hive client
- clients, Kerberos
- configuring / Configuring Kerberos clients
- Cloudera documentation
- on Impala, URL / Installing Impala state store
- Cloudera Hadoop distribution
- about / Cloudera Hadoop distribution
- cluster administrator
- about / MapReduce security
- core-site.xml file / Hadoop configuration files, NameNode HA configuration, core-site.xml
- CorruptedBlocks variable / NameNode checks
D
- --describe option / Launching the EMR cluster
- --driver option / Sqoop import example
- DataNode
- hardware, selecting / Choosing the DataNode hardware
- about / Kerberos in Hadoop
- DataNode configuration
- about / DataNode configuration
- TaskTracker configuration / TaskTracker configuration
- Hadoop tuning / Advanced Hadoop tuning
- DataNode metrics
- URL / JMX Metrics
- dfs.client.failover.proxy.provider.sample-cluster variable / NameNode HA configuration
- dfs.data.dir variable / DataNode configuration
- dfs.datanode.balance.bandwidthPerSec variable / hdfs-site.xml
- dfs.ha.fencing.methods variable / NameNode HA configuration
- dfs.journalnode.edits.dir variable / NameNode HA configuration
- dfs.namenode.replication.min setting / DataNode configuration
- dfs.namenode.shared.edits.dir variable / NameNode HA configuration
- dfs.nameservices variable / NameNode HA configuration
E
- elastic-mapreduce CLI / Choosing the Hadoop version
- EMR
- about / Amazon Elastic MapReduce
- command-line interface, installing / Installing the EMR command-line interface
- EMR cluster
- launching / Launching the EMR cluster
- master instance / Launching the EMR cluster
- terminating / Launching the EMR cluster
- temporary EMR clusters / Temporary EMR clusters
- input and output locations, preparing / Preparing input and output locations
- EMR Web console
- URL / Launching the EMR cluster
- EXT4 filesystem / Choosing and setting up the filesystem
F
- Failover Controller
- installing / JournalNode, ZooKeeper, and Failover Controller
- FairScheduler
- about / FairScheduler, MapReduce security
- filesystem
- setting up / Choosing and setting up the filesystem
- flex_bg option / Choosing and setting up the filesystem
G
- Ganglia
- Hadoop, monitoring with / Monitoring Hadoop with Ganglia
- Gateway servers
- about / Gateway and other auxiliary services
H
- --hadoop-version option / Choosing the Hadoop version
- ha.zookeeper.quorum variable / NameNode HA configuration
- Hadoop
- cluster hardware, selecting / Choosing Hadoop cluster hardware
- hardware, summary / Hadoop hardware summary
- distributions / Hadoop distributions
- versions / Hadoop versions
- distribution, selecting / Choosing Hadoop distribution
- Cloudera Hadoop distribution / Cloudera Hadoop distribution
- Hortonworks Hadoop distribution / Hortonworks Hadoop distribution
- MapR / MapR
- configuration files / Hadoop configuration files
- table, importing from MySQL / Sqoop import example
- security, overview / Hadoop security overview
- Service Level Authorization / Hadoop Service Level Authorization
- and Kerberos / Hadoop and Kerberos
- Kerberos / Kerberos in Hadoop
- metrics / Hadoop Metrics
- monitoring, with Nagios / Monitoring Hadoop with Nagios
- monitoring, with Ganglia / Monitoring Hadoop with Ganglia
- hadoop-hdfs-datanode package / DataNode configuration
- hadoop-hdfs package / Setting up NameNode
- hadoop-metrics.properties file / Hadoop configuration files, Hadoop Metrics
- hadoop-metrics2.properties file / Hadoop configuration files
- Hadoop cluster
- hardware, selecting / Choosing Hadoop cluster hardware
- data sources, identifying / Choosing the DataNode hardware
- data growth rate, estimating / Choosing the DataNode hardware
- estimated storage requirements, multiplying by replication factor / Choosing the DataNode hardware
- MapReduce temporary files and system data, factoring in / Choosing the DataNode hardware
- low storage density cluster / Low storage density cluster
- high storage density cluster / High storage density cluster
- NameNode hardware / The NameNode hardware
- JobTracker hardware / The JobTracker hardware
- Gateway servers / Gateway and other auxiliary services
- network, considerations / Network considerations
- OS, selecting for / Choosing OS for the Hadoop cluster
- OS, configuring for / Configuring OS for Hadoop cluster
- monitoring strategy / Monitoring strategy overview
- Hadoop cluster hardware
- selecting / Choosing Hadoop cluster hardware
- Hadoop Distributed File System (HDFS) / Setting up NameNode
- Hadoop ecosystem
- hosting / Hosting the Hadoop ecosystem
- hadoop jar command / TaskTracker configuration
- hadoop package / Setting up NameNode
- Hadoop tuning
- hdfs-site.xml / hdfs-site.xml
- mapred-site.xml / mapred-site.xml
- core-site.xml / core-site.xml
- Hadoop version
- selecting / Choosing the Hadoop version
- HDFS
- security / HDFS security
- Kerberos, enabling for / Enabling Kerberos for HDFS
- monitoring / Monitoring HDFS
- hdfs-site.xml file / Hadoop configuration files, hdfs-site.xml
- hdfs balancer command / hdfs-site.xml
- hdfs command-line client tool / DataNode configuration
- high storage density cluster
- about / High storage density cluster
- Hive
- about / Hive
- architecture / Hive architecture, Installing Hive Metastore
- Metastore, installing / Installing Hive Metastore
- client, installing / Installing the Hive client
- Server, installing / Installing Hive Server
- HiveQL / Hive
- Hortonworks Hadoop distribution
- about / Hortonworks Hadoop distribution
I
- Impala
- about / Impala
- architecture / Impala architecture
- state store, installing / Installing Impala state store
- server, installing / Installing the Impala server
- server, starting / Installing the Impala server
- using, in command line / Installing the Impala server
- server, connecting to / Installing the Impala server
- import command / Sqoop export example
J
- Java versions, Hadoop
- URL / Setting up Java Development Kit
- JBOD (Just a Bunch of Disks) / Choosing the DataNode hardware
- JMX metrics
- about / JMX Metrics
- JobQueueTaskScheduler
- about / JobQueueTaskScheduler
- job scheduler
- configuring / Configuring the job scheduler
- JobTracker
- hardware / The JobTracker hardware
- package, installing / JobTracker configuration
- JobTracker checks
- host-level checks / JobTracker checks
- service-level checks / JobTracker checks
- JobTracker configuration
- about / JobTracker configuration
- job scheduler, configuring / Configuring the job scheduler
- FairScheduler / FairScheduler
- CapacityTaskScheduler / CapacityTaskScheduler
- JournalNode
- about / Setting up NameNode
- installing, on server / JournalNode, ZooKeeper, and Failover Controller
- JournalNode checks
- host-level resources / JournalNode checks
K
- Kerberos
- about / Hadoop security overview, Kerberos overview
- and Hadoop / Hadoop and Kerberos
- principal, example / Kerberos overview
- in Hadoop / Kerberos in Hadoop
- clients, configuring / Configuring Kerberos clients
- principals, generating / Generating Kerberos principals
- enabling, for HDFS / Enabling Kerberos for HDFS
- enabling, for MapReduce / Enabling Kerberos for MapReduce
- Key Distribution Center (KDC) / Kerberos overview
L
- Linux Alternatives
- URL / Hadoop configuration files
- log4j.properties configuration file / Hadoop configuration files
- low storage density cluster
- about / Low storage density cluster
M
- -m 0 option / Choosing and setting up the filesystem
- manual failover option / Setting up NameNode
- MapR
- about / MapR
- mapred-site.xml file / Hadoop configuration files, mapred-site.xml
- mapred.child.java.opts variable / TaskTracker configuration
- mapred.job.tracker.handler.count variable / mapred-site.xml
- mapred.local.dir directory / JobTracker configuration
- MapReduce
- about / JobTracker configuration
- security / MapReduce security
- Kerberos, enabling for / Enabling Kerberos for MapReduce
- monitoring / Monitoring MapReduce
- MasterPublicDnsName field / Launching the EMR cluster
- Metastore, Hive
- installing / Installing Hive Metastore
- starting / Installing Hive Metastore
- Metastore service / Hive architecture
- Metrics2 / Hadoop Metrics
- MissingBlocks status variable / NameNode checks
- mntr command / ZooKeeper checks
- MySQL
- table, importing to Hadoop / Sqoop import example
- MySQL JDBC driver
- URL, for downloading / Installing and configuring Sqoop
N
- --num-mappers option / Sqoop import example
- Nagios
- Hadoop, monitoring with / Monitoring Hadoop with Nagios
- documentation, URL / Monitoring Hadoop with Nagios
- NameNode
- hardware / The NameNode hardware
- setting up / Setting up NameNode
- manual failover option / Setting up NameNode
- automatic failover option / Setting up NameNode
- HA configuration / NameNode HA configuration
- about / Kerberos in Hadoop
- NameNode checks
- about / NameNode checks
- NumDeadDataNodes status variable / NameNode checks
O
- -O extent,sparse_super,flex_bg option / Choosing and setting up the filesystem
- Oracle JDK
- URL, for downloading / Setting up Java Development Kit
- OS
- selecting, for Hadoop cluster / Choosing OS for the Hadoop cluster
- OS configuration
- for Hadoop cluster / Configuring OS for Hadoop cluster
- filesystem, setting up / Choosing and setting up the filesystem
- filesystem, selecting / Choosing and setting up the filesystem
- Java Development Kit, setting up / Setting up Java Development Kit
- other settings / Other OS settings
- CDH repositories, setting up / Setting up the CDH repositories
P
- principal, components
- primary component / Kerberos overview
- secondary component / Kerberos overview
- realm component / Kerberos overview
- principals, Kerberos
- generating / Generating Kerberos principals
Q
- queue administrator
- about / MapReduce security
- quorum / Monitoring strategy overview
- Quorum Journal Manager / Setting up NameNode
R
- RAID / Choosing the DataNode hardware
- repository
- adding / Setting up the CDH repositories
S
- S3 documentation
- URL / Preparing input and output locations
- SELECT statement / Sqoop export example
- server, Hive
- installing / Installing Hive Server
- server, Impala
- installing / Installing the Impala server
- service level authorization
- about / Hadoop Service Level Authorization
- shell, sshfence / NameNode HA configuration
- sink / Hadoop Metrics
- slaves file / Hadoop configuration files
- source / Hadoop Metrics
- sparse_super option / Choosing and setting up the filesystem
- split-brain / NameNode HA configuration
- Sqoop
- about / Sqoop
- installing / Installing and configuring Sqoop
- configuring / Installing and configuring Sqoop
- installing, CDH repository used / Installing and configuring Sqoop
- import, example / Sqoop import example
- export, example / Sqoop export example
- sshfence / NameNode HA configuration
- State field / Launching the EMR cluster
- state store, Impala
- about / Impala architecture
- installing / Installing Impala state store
T
- TaskTracker
- configuring / TaskTracker configuration
- Ticket-Granting Service (TGS) / Kerberos overview
- Ticket-Granting Ticket (TGT) / Kerberos overview
V
- --version option / Installing the EMR command-line interface
W
- --warehouse-dir option / Sqoop import example
- Whirr
- about / Using Whirr
- installing / Installing and configuring Whirr
- configuration files / Installing and configuring Whirr
Y
- yum command / Setting up NameNode
- yum package
- setting up / Setting up the CDH repositories
Z
- ZooKeeper
- about / Setting up NameNode, JournalNode, ZooKeeper, and Failover Controller
- service, starting / JournalNode, ZooKeeper, and Failover Controller
- ZooKeeper checks
- about / ZooKeeper checks
- zookeeper package / Setting up NameNode