Hadoop Cluster Deployment

Construct a modern Hadoop data platform effortlessly and gain insights into how to manage clusters efficiently

Product type Paperback
Published in Nov 2013
Publisher Packt
ISBN-13 9781783281718
Length 126 pages
Edition 1st Edition
Author (1): Danil Zburvisky

Index

A

  • Apache Bigtop project
    • URL / Setting up NameNode
  • Authentication Server (AS) / Kerberos overview
  • automatic failover option / Setting up NameNode

B

  • bigtop-jsvc package / Setting up NameNode
  • bigtop-utils package / Setting up NameNode

C

  • -chmod command / HDFS security
  • CapacityTaskScheduler
    • about / CapacityTaskScheduler
  • CDH 4.1 / Setting up NameNode
  • CDH HA guide
    • URL / JobTracker configuration
  • CDH repositories
    • setting up / Setting up the CDH repositories
  • CDH repository
    • used, for installing Sqoop / Installing and configuring Sqoop
  • check_ping plugin / NameNode checks
  • CLI / Installing the EMR command-line interface
  • client, Hive
    • installing / Installing the Hive client
  • clients, Kerberos
    • configuring / Configuring Kerberos clients
  • Cloudera documentation
    • on Impala, URL / Installing Impala state store
  • Cloudera Hadoop distribution
    • about / Cloudera Hadoop distribution
  • cluster administrator
    • about / MapReduce security
  • core-site.xml file / Hadoop configuration files, NameNode HA configuration, core-site.xml
  • CorruptedBlocks variable / NameNode checks

D

  • --describe option / Launching the EMR cluster
  • --driver option / Sqoop import example
  • DataNode
    • hardware, selecting / Choosing the DataNode hardware
    • about / Kerberos in Hadoop
  • DataNode configuration
    • about / DataNode configuration
    • TaskTracker configuration / TaskTracker configuration
    • Hadoop tuning / Advanced Hadoop tuning
  • DataNode metrics
    • URL / JMX Metrics
  • dfs.client.failover.proxy.provider.sample-cluster variable / NameNode HA configuration
  • dfs.data.dir variable / DataNode configuration
  • dfs.datanode.balance.bandwidthPerSec variable / hdfs-site.xml
  • dfs.ha.fencing.methods / NameNode HA configuration
  • dfs.journalnode.edits.dir variable / NameNode HA configuration
  • dfs.namenode.replication.min setting / DataNode configuration
  • dfs.namenode.shared.edits.dir variable / NameNode HA configuration
  • dfs.nameservices variable / NameNode HA configuration

E

  • elastic-mapreduce CLI / Choosing the Hadoop version
  • EMR
    • about / Amazon Elastic MapReduce
    • command-line interface, installing / Installing the EMR command-line interface
  • EMR cluster
    • launching / Launching the EMR cluster
    • master instance / Launching the EMR cluster
    • terminating / Launching the EMR cluster
    • temporary EMR clusters / Temporary EMR clusters
    • input and output locations, preparing / Preparing input and output locations
  • EMR Web console
    • URL / Launching the EMR cluster
  • EXT4 filesystem / Choosing and setting up the filesystem

F

  • Failover Controller
    • installing / JournalNode, ZooKeeper, and Failover Controller
  • FairScheduler
    • about / FairScheduler, MapReduce security
  • filesystem
    • setting up / Choosing and setting up the filesystem
  • flex_bg option / Choosing and setting up the filesystem

G

  • Ganglia
    • Hadoop, monitoring with / Monitoring Hadoop with Ganglia
  • Gateway servers
    • about / Gateway and other auxiliary services

H

  • -hadoop-version option / Choosing the Hadoop version
  • ha.zookeeper.quorum variable / NameNode HA configuration
  • Hadoop
    • cluster hardware, selecting / Choosing Hadoop cluster hardware
    • hardware, summary / Hadoop hardware summary
    • distributions / Hadoop distributions
    • versions / Hadoop versions
    • distribution, selecting / Choosing Hadoop distribution
    • Cloudera Hadoop distribution / Cloudera Hadoop distribution
    • Hortonworks Hadoop distribution / Hortonworks Hadoop distribution
    • MapR / MapR
    • configuration files / Hadoop configuration files
    • table, importing from MySQL / Sqoop import example
    • security, overview / Hadoop security overview
    • Service Level Authorization / Hadoop Service Level Authorization
    • and Kerberos / Hadoop and Kerberos
    • Kerberos / Kerberos in Hadoop
    • metrics / Hadoop Metrics
    • monitoring, with Nagios / Monitoring Hadoop with Nagios
    • monitoring, with Ganglia / Monitoring Hadoop with Ganglia
  • hadoop-hdfs-datanode package / DataNode configuration
  • hadoop-hdfs package / Setting up NameNode
  • hadoop-metrics.properties file / Hadoop configuration files, Hadoop Metrics
  • hadoop-metrics2.properties file / Hadoop configuration files
  • Hadoop cluster
    • hardware, selecting / Choosing Hadoop cluster hardware
    • data sources, identifying / Choosing the DataNode hardware
    • data growth rate, estimating / Choosing the DataNode hardware
    • estimated storage requirements, multiplying by replication factor / Choosing the DataNode hardware
    • MapReduce temporary files and system data, factoring in / Choosing the DataNode hardware
    • low storage density cluster / Low storage density cluster
    • high storage density cluster / High storage density cluster
    • NameNode hardware / The NameNode hardware
    • JobTracker hardware / The JobTracker hardware
    • Gateway servers / Gateway and other auxiliary services
    • network, considerations / Network considerations
    • OS, selecting for / Choosing OS for the Hadoop cluster
    • OS, configuring for / Configuring OS for Hadoop cluster
    • monitoring strategy / Monitoring strategy overview
  • Hadoop cluster hardware
    • selecting / Choosing Hadoop cluster hardware
  • Hadoop Distributed File System (HDFS) / Setting up NameNode
  • Hadoop ecosystem
    • hosting / Hosting the Hadoop ecosystem
  • hadoop jar command / TaskTracker configuration
  • hadoop package / Setting up NameNode
  • Hadoop tuning
    • hdfs-site.xml / hdfs-site.xml
    • mapred-site.xml / mapred-site.xml
    • core-site.xml / core-site.xml
  • Hadoop version
    • selecting / Choosing the Hadoop version
  • HDFS
    • security / HDFS security
    • Kerberos, enabling for / Enabling Kerberos for HDFS
    • monitoring / Monitoring HDFS
  • hdfs-site.xml file / Hadoop configuration files, hdfs-site.xml
  • hdfs balancer command / hdfs-site.xml
  • hdfs command-line client tool / DataNode configuration
  • high storage density cluster
    • about / High storage density cluster
  • Hive
    • about / Hive
    • architecture / Hive architecture, Installing Hive Metastore
    • Metastore, installing / Installing Hive Metastore
    • client, installing / Installing the Hive client
    • Server, installing / Installing Hive Server
  • HiveQL / Hive
  • Hortonworks Hadoop distribution
    • about / Hortonworks Hadoop distribution

I

  • Impala
    • about / Impala
    • architecture / Impala architecture
    • state store, installing / Installing Impala state store
    • server, installing / Installing the Impala server
    • server, starting / Installing the Impala server
    • using, in command line / Installing the Impala server
    • server, connecting to / Installing the Impala server
  • import command / Sqoop export example

J

  • Java versions, Hadoop
    • URL / Setting up Java Development Kit
  • JBOD (Just a Bunch of Disks) / Choosing the DataNode hardware
  • JMX metrics
    • about / JMX Metrics
  • JobQueueTaskScheduler
    • about / JobQueueTaskScheduler
  • job scheduler
    • configuring / Configuring the job scheduler
  • JobTracker
    • hardware / The JobTracker hardware
    • package, installing / JobTracker configuration
  • JobTracker checks
    • host-level checks / JobTracker checks
    • service-level checks / JobTracker checks
  • JobTracker configuration
    • about / JobTracker configuration
    • job scheduler, configuring / Configuring the job scheduler
    • FairScheduler / FairScheduler
    • CapacityTaskScheduler / CapacityTaskScheduler
  • JournalNode
    • about / Setting up NameNode
    • installing, on server / JournalNode, ZooKeeper, and Failover Controller
  • JournalNode checks
    • host-level resources / JournalNode checks

K

  • Kerberos
    • about / Hadoop security overview, Kerberos overview
    • and Hadoop / Hadoop and Kerberos
    • principal, example / Kerberos overview
    • in Hadoop / Kerberos in Hadoop
    • clients, configuring / Configuring Kerberos clients
    • principals, generating / Generating Kerberos principals
    • enabling, for HDFS / Enabling Kerberos for HDFS
    • enabling, for MapReduce / Enabling Kerberos for MapReduce
  • Key Distribution Center (KDC) / Kerberos overview

L

  • Linux Alternatives
    • URL / Hadoop configuration files
  • log4j.properties configuration file / Hadoop configuration files
  • low storage density cluster
    • about / Low storage density cluster

M

  • -m 0 option / Choosing and setting up the filesystem
  • manual failover option / Setting up NameNode
  • MapR
    • about / MapR
  • mapred-site.xml file / Hadoop configuration files, mapred-site.xml
  • mapred.child.java.opts variable / TaskTracker configuration
  • mapred.job.tracker.handler.count variable / mapred-site.xml
  • mapred.local.dir directory / JobTracker configuration
  • MapReduce
    • about / JobTracker configuration
    • security / MapReduce security
    • Kerberos, enabling for / Enabling Kerberos for MapReduce
    • monitoring / Monitoring MapReduce
  • MasterPublicDnsName field / Launching the EMR cluster
  • Metastore, Hive
    • installing / Installing Hive Metastore
    • starting / Installing Hive Metastore
  • Metastore service / Hive architecture
  • Metrics2 / Hadoop Metrics
  • MissingBlocks status variable / NameNode checks
  • mntr command / ZooKeeper checks
  • MySQL
    • table, importing to Hadoop / Sqoop import example
  • MySQL JDBC driver
    • URL, for downloading / Installing and configuring Sqoop

N

  • --num-mappers option / Sqoop import example
  • Nagios
    • Hadoop, monitoring with / Monitoring Hadoop with Nagios
    • documentation, URL / Monitoring Hadoop with Nagios
  • NameNode
    • hardware / The NameNode hardware
    • setting up / Setting up NameNode
    • manual failover option / Setting up NameNode
    • automatic failover option / Setting up NameNode
    • HA configuration / NameNode HA configuration
    • about / Kerberos in Hadoop
  • NameNode checks
    • about / NameNode checks
  • NumDeadDataNodes status variables / NameNode checks

O

  • -O extent,sparse_super,flex_bg option / Choosing and setting up the filesystem
  • Oracle JDK
    • URL, for downloading / Setting up Java Development Kit
  • OS
    • selecting, for Hadoop cluster / Choosing OS for the Hadoop cluster
  • OS configuration
    • for Hadoop cluster / Configuring OS for Hadoop cluster
    • filesystem, setting up / Choosing and setting up the filesystem
    • filesystem, selecting / Choosing and setting up the filesystem
    • Java Development Kit, setting up / Setting up Java Development Kit
    • other settings / Other OS settings
    • CDH repositories, setting up / Setting up the CDH repositories

P

  • principal, components
    • primary component / Kerberos overview
    • secondary component / Kerberos overview
    • realm component / Kerberos overview
  • principals, Kerberos
    • generating / Generating Kerberos principals

Q

  • queue administrator
    • about / MapReduce security
  • quorum / Monitoring strategy overview
  • Quorum Journal Manager / Setting up NameNode

R

  • RAID / Choosing the DataNode hardware
  • repository
    • adding / Setting up the CDH repositories

S

  • S3 documentation
    • URL / Preparing input and output locations
  • SELECT statement / Sqoop export example
  • server, Hive
    • installing / Installing Hive Server
  • server, Impala
    • installing / Installing the Impala server
  • service level authorization
    • about / Hadoop Service Level Authorization
  • shell. sshfence / NameNode HA configuration
  • sink / Hadoop Metrics
  • slaves file / Hadoop configuration files
  • source / Hadoop Metrics
  • sparse_super option / Choosing and setting up the filesystem
  • split-brain / NameNode HA configuration
  • Sqoop
    • about / Sqoop
    • installing / Installing and configuring Sqoop
    • configuring / Installing and configuring Sqoop
    • installing, CDH repository used / Installing and configuring Sqoop
    • import, example / Sqoop import example
    • export, example / Sqoop export example
  • sshfence / NameNode HA configuration
  • State field / Launching the EMR cluster
  • state store, Impala
    • about / Impala architecture
    • installing / Installing Impala state store

T

  • TaskTracker
    • configuring / TaskTracker configuration
  • Ticket-Granting Service (TGS) / Kerberos overview
  • Ticket-Granting Ticket (TGT) / Kerberos overview

V

  • -version option / Installing the EMR command-line interface

W

  • --warehouse-dir option / Sqoop import example
  • Whirr
    • about / Using Whirr
    • installing / Installing and configuring Whirr
    • configuration files / Installing and configuring Whirr

Y

  • yum command / Setting up NameNode
  • yum package
    • setting up / Setting up the CDH repositories

Z

  • ZooKeeper
    • about / Setting up NameNode, JournalNode, ZooKeeper, and Failover Controller
    • service, starting / JournalNode, ZooKeeper, and Failover Controller
  • ZooKeeper checks
    • about / ZooKeeper checks
  • zookeeper package / Setting up NameNode