Hadoop platforms
With the advent of search engines, social networks, and online marketplaces, data volumes grew exponentially. Searching and processing data at that scale required a different approach to meet service-level agreements (SLAs) and customer expectations. Google pioneered a new paradigm, described in its papers on the Google File System and MapReduce, in which data is stored and processed across many machines and failures are handled automatically; the open source Nutch search engine project then implemented the same ideas. Out of that work Hadoop was born, becoming a top-level Apache project in 2008, and it has proved to be a lifesaver for storing and processing huge volumes of data (on the order of terabytes or more) efficiently and quickly.
Apache Hadoop is an open source framework for the distributed storage and processing of large datasets across clusters of computers. It is designed to scale out easily from a single server to thousands of machines. Because it detects and recovers from node failures at the application layer rather than depending on the reliability of the hardware, it delivers a highly available service on top of inexpensive commodity machines.
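To make the programming model concrete, the following is a minimal sketch of the classic MapReduce word-count job, adapted from the standard Apache Hadoop tutorial example. The map tasks run in parallel across the cluster, each emitting a (word, 1) pair per word, and the reduce tasks sum the counts for each word; the class name WordCount and the input/output paths are illustrative placeholders supplied on the command line.

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Mapper: runs on each split of the input, emitting (word, 1)
  // for every token in every line it is given.
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {

    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  // Reducer: receives all counts emitted for a given word
  // and writes out their sum.
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {

    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values,
        Context context) throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    // Combiner runs the reducer logic locally on each mapper's
    // output to cut down the data shuffled across the network.
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));   // HDFS input path
    FileOutputFormat.setOutputPath(job, new Path(args[1])); // HDFS output path
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

Packaged into a JAR, the job would be submitted with a command along the lines of hadoop jar wordcount.jar WordCount /input /output. The key point is that the code contains no cluster logic at all: Hadoop splits the input, schedules the map and reduce tasks across the nodes, and reruns any task whose node fails.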