You're reading from Modern Big Data Processing with Hadoop: Expert techniques for architecting end-to-end big data solutions to get valuable insights

Product type Paperback

Published in Mar 2018

Publisher Packt

ISBN-13 9781787122765

Length 394 pages

Edition 1st Edition

Authors (3):

Manoj R Patil

Prashant Shindgikar

V Naresh Kumar

Table of Contents (12 chapters)

Preface

1. Enterprise Data Architecture Principles

2. Hadoop Life Cycle Management

3. Hadoop Design Consideration

4. Data Movement Techniques

5. Data Modeling in Hadoop

6. Designing Real-Time Streaming Data Pipelines

7. Large-Scale Data Processing Frameworks

8. Building Enterprise Search Platform

9. Designing Data Visualization Solutions

10. Developing Applications Using the Cloud

11. Production Hadoop Cluster Deployment

Best practices for Hadoop deployment

The following are some best practices for Hadoop deployment:

Start small: Like other software projects, a Hadoop implementation involves risk and cost. It is better to begin with a small Hadoop cluster of four nodes, set up as a proof of concept (POC). Before any new Hadoop component is used, it can be added to the existing POC cluster as a proof of technology (POT). This lets the infrastructure and development teams understand the requirements of the big data project. After the POC and POT are completed successfully, additional nodes can be added to the existing cluster.
Hadoop cluster monitoring: Proper monitoring of the NameNode and all DataNodes is required to understand the health of the cluster. It helps you take corrective action in the event of node problems. If a service goes down, timely action can help...
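
The monitoring practice above can be illustrated with a small health check. The following is a minimal sketch, not the book's method: it assumes the NameNode exposes its JMX servlet over HTTP (port 9870 by default on Hadoop 3.x, 50070 on Hadoop 2.x) and that the `FSNamesystemState` bean reports `NumDeadDataNodes`, `CapacityTotal`, and `CapacityUsed`. The function names, the 80% capacity threshold, and the host name in the usage note are hypothetical choices made for illustration.

```python
import json
from urllib.request import urlopen

# Assumption: query string for the NameNode JMX servlet that selects
# the FSNamesystemState bean.
JMX_QUERY = "/jmx?qry=Hadoop:service=NameNode,name=FSNamesystemState"


def fetch_namesystem_state(namenode_http_address, timeout=10):
    """Fetch the FSNamesystemState bean from a NameNode's JMX servlet.

    namenode_http_address is a base URL such as "http://host:9870".
    """
    with urlopen(namenode_http_address + JMX_QUERY, timeout=timeout) as resp:
        payload = json.load(resp)
    # The servlet wraps matching beans in a "beans" list.
    return payload["beans"][0]


def summarize_health(state, max_used_fraction=0.80):
    """Return a list of warnings for the given bean; empty means no warnings.

    max_used_fraction is an illustrative threshold, not a Hadoop default.
    """
    warnings = []

    dead = state.get("NumDeadDataNodes", 0)
    if dead > 0:
        warnings.append(f"{dead} DataNode(s) reported dead")

    total = state.get("CapacityTotal", 0)
    used = state.get("CapacityUsed", 0)
    if total > 0 and used / total > max_used_fraction:
        warnings.append(
            f"HDFS capacity {used / total:.0%} used exceeds "
            f"{max_used_fraction:.0%} threshold"
        )

    return warnings
```

A scheduled job could call `summarize_health(fetch_namesystem_state("http://namenode.example.com:9870"))` and raise an alert whenever the returned list is non-empty. Keeping the evaluation separate from the HTTP fetch makes the thresholds easy to test without a running cluster.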

Authors (3)

Prashant Shindgikar

Prashant Shindgikar is an accomplished big data architect with over 20 years of experience in data analytics. He specializes in data innovation and resolving data challenges for major retail brands. He is a hands-on architect with an innovative approach to solving data problems. He provides thought leadership and pursues strategies for engagements with senior executives on innovation in data processing and analytics. He presently works for a large US-based retail company.

Manoj R Patil

Manoj R Patil is the Chief Architect in Big Data at Compassites Software Solutions Pvt. Ltd., where he oversees the overall platform architecture related to Big Data solutions and also contributes hands-on to some assignments. He has been working in the IT industry for the last 15 years. He started as a programmer and, on the way, acquired skills in architecting and designing solutions, managing projects with each stakeholder's interest in mind, and deploying and maintaining solutions on cloud infrastructure. He has been working on the Pentaho-related stack for the last 5 years, providing solutions both for employers and as a freelancer. Manoj has extensive experience in JavaEE, MySQL, various frameworks, and Business Intelligence, and is keen to pursue his interest in predictive analysis. He was also associated with TalentBeat, Inc. and Persistent Systems, and implemented interesting solutions in logistics, data masking, and data-intensive life sciences.

Kumar

Ashish Kumar is a seasoned data science professional, a published author, and a thought leader in the field of data science and machine learning. An IIT Madras graduate and a Young India Fellow, he has around 7 years of experience in implementing and deploying data science and machine learning solutions for challenging industry problems in both hands-on and leadership roles. Natural Language Processing, IoT analytics, R Shiny product development, and ensemble ML methods are among his core areas of expertise. He is fluent in Python and R and teaches a popular ML course at Simplilearn. When not crunching data, Ashish sneaks off to the next hip beach around and enjoys the company of his Kindle. He also trains and mentors data science aspirants and fledgling start-ups.