You're reading from Apache Hadoop 3 Quick Start Guide Learn about big data processing and analytics

Product type Paperback

Published in Oct 2018

Publisher Packt

ISBN-13 9781788999830

Length 220 pages

Edition 1st Edition

Languages Java

Tools Hadoop

Concepts Big Data

Author (1):

Hrishikesh Vijay Karambelkar

Table of Contents (10) Chapters

Preface

1. Hadoop 3.0 - Background and Introduction

2. Planning and Setting Up Hadoop Clusters

3. Deep Dive into the Hadoop Distributed File System

4. Developing MapReduce Applications

5. Building Rich YARN Applications

6. Monitoring and Administration of a Hadoop Cluster

7. Demystifying Hadoop Ecosystem Components

8. Advanced Topics in Apache Hadoop

9. Other Books You May Enjoy

Leave a review - let other readers know what you think

Data analytics with Apache Spark

Apache Spark offers a fast processing engine that can run on top of Apache Hadoop. It processes data in memory across the cluster, which enables analytics at high speed. Spark originated at the AMPLab (UC Berkeley) in 2009 and was later open sourced and donated to the Apache Software Foundation. Spark can use YARN as its cluster manager. The following are the key features of Apache Spark:

Fast: Because it processes data in memory, Spark avoids repeated disk reads and writes between processing steps
Multiple language support: You can write Spark programs in Java, Scala, R, and Python
Deep analytics: It provides distributed analytics, including machine learning, streaming data processing, and data querying
Rich API support: It provides a rich set of APIs in each of the supported languages
Multi-processing engine support: Apache Spark can be deployed on MapReduce, YARN, and Mesos...
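
To make the in-memory point above concrete, here is a minimal sketch in plain Java. It does not use the Spark API. The class `InMemoryReuseSketch`, the method `readFromStorage`, and the `sourceReads` counter are hypothetical names invented for this illustration; the counter simply stands in for expensive disk I/O. The sketch assumes two analyses (a sum and a maximum) over the same records and compares re-reading the source for each analysis with loading the records once and reusing them from memory.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

public class InMemoryReuseSketch {
    // Counts reads of the "slow" source; a stand-in for disk I/O (assumption).
    static int sourceReads = 0;

    // Hypothetical data source: pretends to load ten records from storage.
    static List<Integer> readFromStorage() {
        sourceReads++;
        List<Integer> records = new ArrayList<>();
        for (int i = 1; i <= 10; i++) {
            records.add(i);
        }
        return records;
    }

    // Runs two analyses (sum and max); each one asks the supplier for the data.
    static int[] runTwoAnalyses(Supplier<List<Integer>> data) {
        int sum = data.get().stream().mapToInt(Integer::intValue).sum();
        int max = data.get().stream().mapToInt(Integer::intValue).max().orElse(0);
        return new int[] {sum, max};
    }

    public static void main(String[] args) {
        // Without reuse: every analysis goes back to the source.
        sourceReads = 0;
        int[] uncached = runTwoAnalyses(InMemoryReuseSketch::readFromStorage);
        System.out.println("sum=" + uncached[0] + " max=" + uncached[1]
                + " reads=" + sourceReads);

        // With reuse: load once, keep the records in memory, share them.
        sourceReads = 0;
        List<Integer> cached = readFromStorage();
        int[] fromMemory = runTwoAnalyses(() -> cached);
        System.out.println("sum=" + fromMemory[0] + " max=" + fromMemory[1]
                + " reads=" + sourceReads);
    }
}
```

Both runs produce the same results, but the second touches the source only once. Spark applies the same idea at cluster scale by keeping intermediate datasets in memory across the worker nodes.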

Authors (1)

Vijay Karambelkar

Hrishikesh Vijay Karambelkar is an innovator and an enterprise architect with 16 years of software design and development experience, specifically in the areas of big data, enterprise search, data analytics, text mining, and databases. He is passionate about architecting new software implementations for the next generation of software solutions for various industries, including oil and gas, chemicals, manufacturing, utilities, healthcare, and government infrastructure. In the past, he has authored three books for Packt Publishing: two editions of Scaling Big Data with Hadoop and Solr and one of Scaling Apache Solr. He has also worked with graph databases, and some of his work has been published at international conferences such as VLDB and ICDE.