Cloudera Administration Handbook: A complete, hands-on guide to building and maintaining large Apache Hadoop clusters using Cloudera Manager and CDH5

Menon

$36.99

3.5 (10 Ratings)

eBook Jul 2014 254 pages 1st Edition

What do you get with eBook?

Instant access to your Digital eBook purchase

Download this book in EPUB and PDF formats

Access this title in our online reader with advanced features

DRM FREE - Read whenever, wherever and however you want

Description

An easy-to-follow Apache Hadoop administrator’s guide filled with practical screenshots and explanations for each step and configuration. This book is great for administrators interested in setting up and managing a large Hadoop cluster. If you are an administrator, or want to be an administrator, and you are ready to build and maintain a production-level cluster running CDH5, then this book is for you.

What you will learn

Understand the Apache Hadoop architecture and the future of distributed processing frameworks
Use HDFS and MapReduce for all file-related operations
Install and configure CDH to bring up an Apache Hadoop cluster
Configure HDFS High Availability and HDFS Federation to prevent single points of failure
Install and configure Cloudera Manager to perform administrator operations
Implement security by installing and configuring Kerberos for all services in the cluster
Add, remove, and rebalance nodes in a cluster using cluster management tools
Understand and configure the different backup options to back up your HDFS

william El Kaim Sep 22, 2014

The Cloudera Administration Handbook written by Rohit Menon is a fantastic resource for anybody wanting to understand and manage a Cloudera platform.

I have to admit that I’m a rookie and that this book was exactly what I was dreaming of: all the information in the same place, with code examples for both Linux and Windows.

The book is mainly targeted at big data experts and system administrators. The first three chapters give the minimum background needed to understand MapReduce, Hadoop, and YARN, as well as Cloudera's Distribution Including Apache Hadoop (all services are listed and explained).

Then you enter the “hard part”. Chapter 4, which discusses HDFS Federation and its High Availability in detail, and chapter 7, “Managing an Apache Hadoop Cluster”, were particularly valuable for me. Chapter 5 presents Cloudera Manager, a web-browser-based administration tool for managing Apache Hadoop clusters, and shows you how to manage clusters with point and click instead of the command line. Chapter 6 is about configuring access and rights using Kerberos services. It does show you how to implement the security services, but not how to manage user rights, which is a step requiring some planning. Monitoring and backup (using the Hadoop utility DistCp and Cloudera Manager) are also presented in two distinct parts.

What I like in this book is that it goes directly to the point, assuming you already know the basics of system administration and distributed architecture. It then shares many “tips” that only an experienced professional would know, and it enabled the rookie I was to avoid mistakes. With this book, you will save time. For example, the author tells you when a SPOF (single point of failure) exists and the solutions to avoid it.

The only part of the book that was missing for me was cloud deployment. I would have liked a chapter explaining how to set up Cloudera in the cloud, with the code (Puppet or Chef) to automate the install.

It is clearly a book worth buying for people wanting to set up and deploy a Cloudera platform correctly. I also like the fact that for the same price you can download the PDF, mobi, EPUB, and Kindle versions.

Amazon Verified review

A. Zubarev Sep 25, 2014

Cloudera Administration Handbook is another great example of what I call a 'desk companion' book, and a must for a beginner Cloudera administrator.

Writing as a person from "the trenches", with a well-balanced ratio of material to feature coverage, Rohit expands on exactly what a Hadoop admin needs and should be using from the Cloudera offerings in this area of expertise to successfully accomplish one's day-to-day tasks.

However, it is actually a lot more than just an admin's book. It also teaches how to install most of the Cloudera Hadoop ecosystem components, which components are typically used for what in a business, and how to configure each. All of that is done in a thorough, precise, and professional manner without any extra fuss or foofaraw.

I liked that the author expanded briefly, but nicely, on the new features in Hadoop 2.0. For me the coverage of MapReduce appeared the most valuable. I admit it is a rough area of Hadoop.

The troubleshooting part must be the one to read and re-read, along with high availability, backup, balancing, and security. The Kerberos setup especially I deem a very necessary yet rarely covered topic. It also appears very hard to understand, at least to me, but it was very much worth going through. As an aside, the CDH distribution is so extensive and feature-rich that it is no wonder a whole book can be dedicated to just this topic. After reading the book, I must say Cloudera Manager is an awesome tool to have on board. It is a great helper, but it requires a good book such as Cloudera Administration Handbook by Rohit Menon to get acquainted with it.

Have it beside you, at your desk.

Amazon Verified review

Robert Rapplean Sep 12, 2014

This book provides an excellent overview of how to use the Cloudera Manager to build and maintain an industrial Hadoop cluster. I think its best audience is either someone who is already familiar with Hadoop and will need to start managing a Cloudera cluster, or someone who will mostly just be interacting with the Cloudera Manager interface while a primary system administrator handles the more complicated issues that revolve around unpredictable variations. It starts off with a relatively watered-down overview of the concepts behind Hadoop, but around Chapter 5 it really picks up and provides a great description of how to build and manage your cluster. The techniques provided in this book make use of Cloudera Manager wherever possible, as this is the preferred method of setting up and maintaining a Cloudera Hadoop cluster.

One of the strongest points is that it contributes to the amount of published knowledge around Hadoop 2, which has been slow to catch up with the release of the technology.

There are a few shortcomings that prevent me from giving it a full five stars. Since it focuses on the Cloudera Manager, it could leave a fledgling admin in a bad place if things aren't all lined up just right. The education base of the target audience is a little narrow since its tone is aimed at informing, not teaching, so it excludes those who are not familiar with system administration in general. In the other direction, it doesn't provide the in-depth, "under the hood" details that heavyweight system administrators enjoy wading through (but which would require a much thicker book).

Amazon Verified review

Si Dunn Sep 17, 2014

This is a well-organized, well-written and solidly illustrated guide to building and maintaining large Apache Hadoop clusters using Cloudera Manager and CDH5. You can't get everything you need to know in one book, no matter how big it is. And I would have preferred seeing a bit more information about the Cloudera Manager nearer to the front (while not giving up the learn-to-use-the-command-line-first approach). But this book definitely can get you going, and it can show you a lot of what you will need to know as an administrator of Hadoop clusters. I have read and used a few other Hadoop how-to books in recent years. This one will be a keeper. (Thanks to Packt Publishing for providing a review copy).

Amazon Verified review

Ganapathy Kokkeshwara Oct 09, 2014

This is a must-read book for anyone who is in the process of learning the administration of the Cloudera distribution of Hadoop. Beyond learning, the book can also be used for reference. In the first two chapters, the author talks about Apache Hadoop, its various components, HDFS, and MapReduce. These chapters provide a very informative introduction to Apache Hadoop and the ecosystem associated with it. They are useful for anyone interested in learning Hadoop technology, not just for Cloudera administrators.

In the third chapter, the author talks about the Cloudera distribution of Apache Hadoop and the other components that are distributed with Cloudera's Distribution Including Apache Hadoop (CDH), such as Flume, Sqoop, Pig, Hive, and ZooKeeper. The components are explained in simple terms that most technical people can understand. I liked the screenshots of the UI for each of the components, which made them a little easier to understand. This chapter also covers the installation of CDH and the various components. I liked the fact that the author covers installation from Cloudera Manager as well as from the operating system’s package manager, thus providing more options for the administrator.

The rest of the chapters cover administering high availability, implementing security using Kerberos, managing the cluster, and monitoring. Chapter 9 talks about backing up the Hadoop cluster. I liked the fact that the author took time to explain the various types of backup and storage media for backups before actually getting into the technical nitty-gritty of backing up big data.

Throughout the book, the author also writes briefly about the impact on administering CDH when it is deployed in the cloud and provides relevant web links for further reference.

Amazon Verified review
