Once you start working on real problems and implementing Hadoop clusters, you'll have to deal with the issue of sizing. It's not just the sizing of the cluster itself that needs to be considered, but also the SLAs associated with the Hadoop runtime. A cluster can be categorized based on workloads as follows (a rough capacity-sizing sketch follows the list):
- Lightweight: This category is intended for low computation and storage requirements, and is most useful for well-defined datasets with little or no growth
- Balanced: A balanced cluster can have storage and computation requirements that grow over time
- Storage-centric: This category is focused more on storing data and less on computation; it is mostly used for archival purposes, along with minimal processing
- Compute-centric: This cluster is intended for heavy computation that requires CPU- or GPU-intensive work, such as analytics and prediction
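
To make the sizing discussion concrete, the following is a minimal sketch of a raw-storage estimate for a balanced cluster. The replication factor, temporary-space overhead, growth rate, and planning horizon used here are illustrative assumptions, not values prescribed by this section:

```python
def estimate_raw_storage_tb(initial_data_tb,
                            monthly_growth_rate=0.05,   # assumed 5% monthly data growth
                            months=12,                   # assumed planning horizon
                            replication_factor=3,        # HDFS default replication
                            temp_space_overhead=0.25):   # assumed scratch/intermediate space
    """Rough raw-capacity estimate; all defaults are illustrative assumptions."""
    # Project the dataset size at the end of the planning horizon.
    projected_tb = initial_data_tb * (1 + monthly_growth_rate) ** months
    # Account for HDFS replication and temporary space consumed by running jobs.
    return projected_tb * replication_factor * (1 + temp_space_overhead)

if __name__ == "__main__":
    # Example: 100 TB of data today, sized for a year of growth.
    print(f"Estimated raw capacity: {estimate_raw_storage_tb(100):.0f} TB")
```

A storage-centric cluster would typically use a higher temporary-space overhead of near zero and a longer horizon, while a compute-centric cluster would size primarily on CPU/GPU slots rather than on this storage estimate.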