Data Lake for Enterprises: Lambda Architecture for building enterprise data systems

Product type Paperback

Published in May 2017

Publisher Packt

ISBN-13 9781787281349

Length 596 pages

Edition 1st Edition

Languages

Java

Tools

Hadoop

Concepts

Data Processing

Authors (3):

Pankaj Misra

Tomcy John

Vivek Mishra

Table of Contents (13 chapters)

Preface

1. Introduction to Data

2. Comprehensive Concepts of a Data Lake

3. Lambda Architecture as a Pattern for Data Lake

4. Applied Lambda for Data Lake

5. Data Acquisition of Batch Data using Apache Sqoop

6. Data Acquisition of Stream Data using Apache Flume

7. Messaging Layer using Apache Kafka

8. Data Processing using Apache Flink

9. Data Store Using Apache Hadoop

10. Indexed Data Store using Elasticsearch

11. Data Lake Components Working Together

12. Data Lake Use Case Suggestions

Other Hadoop Processing Options

Apache Hadoop comes up almost any time big data is discussed, and it has become a near-mandatory component of big data systems. Hadoop is an excellent choice, but some of its inherent characteristics give developers pause when choosing a platform, especially as data volumes and processing demands keep growing in every enterprise because of changing business dynamics. Its most frequently cited disadvantages are its complexity and the way it executes jobs. For these reasons, recent innovations have aimed to simplify Hadoop processing further, notably Pig scripts and Apache Spark.

Pig scripts provide a good alternative that simplifies MapReduce work through the Pig Latin language, while still enabling non-Java developers...

Authors (3)

Tomcy John

Tomcy John lives in Dubai (United Arab Emirates), hails from Kerala (India), and is an enterprise Java specialist with a degree in Engineering (B Tech) and over 14 years of experience in several industries. He currently works as principal architect at Emirates Group IT, in their core architecture team. Prior to this, he worked with Oracle Corporation and Ernst & Young. He specializes in building enterprise-grade applications and acts as chief mentor and evangelist, helping the organization adopt new technologies as corporate standards. Outside of work, Tomcy mentors young developers and engineers and speaks at various forums as a technical evangelist on topics ranging from web and middleware to various persistence stores.

Mishra

Charit Mishra is an ICS/SCADA security professional. He works as a security architect for critical infrastructure industries (oil and gas, energy and utility, transport, telecom, and so on) and has extensive experience in security standards, frameworks, and technologies, with real hands-on experience in security. He has obtained leading industry certifications, such as OSCP, CEH, CompTIA Security+, and CCNA R&S. He also holds a master's degree in computer science. He regularly delivers professional training on critical infrastructure security internationally.

Pankaj Misra

Pankaj Misra has been a technology evangelist, holding a bachelor's degree in engineering, with over 16 years of experience across multiple business domains and technologies. He has been working with Emirates Group IT since 2015, and has worked with various other organizations in the past. He specializes in architecting and building multi-stack solutions and implementations. He has also been a speaker at technology forums in India and has built products with scale-out architecture that support high-volume, near-real-time data processing and near-real-time analytics.
