Cloudera Administration Handbook
A complete, hands-on guide to building and maintaining large Apache Hadoop clusters using Cloudera Manager and CDH5

Author: Rohit Menon
Publisher: Packt, July 2014 (1st Edition)
ISBN-13: 9781783558964
Length: 254 pages
Table of Contents

Preface
1. Getting Started with Apache Hadoop
2. HDFS and MapReduce
3. Cloudera's Distribution Including Apache Hadoop
4. Exploring HDFS Federation and Its High Availability
5. Using Cloudera Manager
6. Implementing Security Using Kerberos
7. Managing an Apache Hadoop Cluster
8. Cluster Monitoring Using Events and Alerts
9. Configuring Backups
Index

Components of Apache Hadoop

Apache Hadoop is composed of two core components:

  • HDFS: HDFS is the storage component of Apache Hadoop: a distributed filesystem designed to store large files efficiently on a cluster by splitting them into blocks and distributing those blocks redundantly across multiple nodes. Users of HDFS need not worry about the underlying networking details, as HDFS takes care of them. HDFS is written in Java and runs in user space (see the first sketch after this list).
  • MapReduce: MapReduce is a programming model built on ideas from functional programming and distributed computing. A job is broken into two phases: map and reduce. All data flows through the model as key-value pairs, <key, value>: mappers emit key-value pairs, and reducers receive them, work on them, and produce the final result (see the second sketch after this list). The model was built specifically to query and process the large volumes of data stored in HDFS.

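To make the HDFS abstraction concrete, here is a minimal sketch using Hadoop's Java FileSystem API. It assumes an fs.defaultFS setting (for example, hdfs://namenode:8020) is available in a core-site.xml on the classpath; the class name HdfsHello and the path /tmp/hello.txt are purely illustrative.

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsHello {
    public static void main(String[] args) throws Exception {
        // Picks up fs.defaultFS from core-site.xml on the classpath.
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        Path path = new Path("/tmp/hello.txt"); // hypothetical example path

        // Write: the client writes an ordinary stream; HDFS splits it into
        // blocks and replicates each block across DataNodes transparently.
        try (FSDataOutputStream out = fs.create(path, true)) {
            out.write("Hello, HDFS!\n".getBytes(StandardCharsets.UTF_8));
        }

        // Read the file back through the same filesystem abstraction.
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(fs.open(path), StandardCharsets.UTF_8))) {
            System.out.println(in.readLine());
        }
    }
}

Notice that nothing in the client code mentions blocks, replicas, or DataNodes; that is the sense in which HDFS hides the underlying networking from its users.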
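And here is a minimal sketch of the MapReduce model itself: the classic word count, where the mapper emits a <word, 1> pair per word and the reducer sums the counts for each word. Only the Mapper and Reducer types come from Hadoop's org.apache.hadoop.mapreduce API; the class names are illustrative, and the driver class that configures and submits the job is omitted.

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

public class WordCount {

    // Map phase: for each input line, emit a <word, 1> pair per word.
    public static class TokenMapper
            extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(LongWritable offset, Text line, Context context)
                throws IOException, InterruptedException {
            for (String token : line.toString().split("\\s+")) {
                if (!token.isEmpty()) {
                    word.set(token);
                    context.write(word, ONE); // emit <key, value>
                }
            }
        }
    }

    // Reduce phase: receive <word, [1, 1, ...]> and emit <word, total>.
    public static class SumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text word, Iterable<IntWritable> counts,
                Context context) throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable c : counts) {
                sum += c.get();
            }
            context.write(word, new IntWritable(sum));
        }
    }
}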
We will be going through HDFS and MapReduce in depth in the next chapter.
