Summary
In this chapter, we learned how to analyze a problem and identify whether it is a big data problem. We also learned how to choose a platform and technology stack that is performant, optimized, and cost-effective, and how to weigh these factors judiciously to develop a big data batch processing solution in the cloud. Then, we learned how to analyze, profile, and draw inferences from big data files using AWS Glue DataBrew. After that, we learned how to develop, deploy, and run a Spark Java application in the AWS cloud to process a huge volume of data and store it in an ODL. We also discussed how to write an AWS Lambda trigger function in Java to automate the Spark jobs. Finally, we learned how to expose the processed ODL data through an Amazon Athena table so that downstream systems can easily query and use it.
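As a brief recap of the Lambda trigger pattern covered in this chapter, the following is a minimal sketch of an S3-triggered Lambda handler written in Java. It assumes the Spark job is deployed as an AWS Glue job; the job name "odl-batch-spark-job" and the argument keys are illustrative placeholders, not names from the chapter.

```java
// Minimal sketch: an S3-triggered Lambda that starts a Glue Spark job run.
// Assumption: the chapter's Spark job is deployed as an AWS Glue job; the
// job name and argument keys below are hypothetical.
import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;
import com.amazonaws.services.lambda.runtime.events.S3Event;
import software.amazon.awssdk.services.glue.GlueClient;
import software.amazon.awssdk.services.glue.model.StartJobRunRequest;
import software.amazon.awssdk.services.glue.model.StartJobRunResponse;

import java.util.Map;

public class SparkJobTriggerHandler implements RequestHandler<S3Event, String> {

    private final GlueClient glueClient = GlueClient.create();

    @Override
    public String handleRequest(S3Event event, Context context) {
        // Each record describes one object that landed in the source bucket.
        var record = event.getRecords().get(0);
        String bucket = record.getS3().getBucket().getName();
        String key = record.getS3().getObject().getKey();

        // Start the Glue Spark job, passing the new object's location as job arguments.
        StartJobRunRequest request = StartJobRunRequest.builder()
                .jobName("odl-batch-spark-job")          // hypothetical job name
                .arguments(Map.of(
                        "--source_bucket", bucket,
                        "--source_key", key))
                .build();

        StartJobRunResponse response = glueClient.startJobRun(request);
        context.getLogger().log("Started Glue job run: " + response.jobRunId());
        return response.jobRunId();
    }
}
```

In this pattern, the Lambda function stays small and stateless: it only translates the S3 event into job parameters, while the heavy lifting remains in the Spark job itself.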
Now that we have learned how to develop optimized and cost-effective batch-based data processing solutions for different kinds of data volumes...