You're reading from Scalable Data Architecture with Java: Build efficient enterprise-grade data architecting solutions using Java

Product type: Paperback
Published in: Sep 2022
Publisher: Packt
ISBN-13: 9781801073080
Length: 382 pages
Edition: 1st Edition
Languages: Java
Tools: Deeplearning4j
Concepts: Data Science
Author: Sinchan Banerjee


Table of Contents (19) Chapters

Preface

1. Section 1 – Foundation of Data Systems

2. Chapter 1: Basics of Modern Data Architecture

3. Chapter 2: Data Storage and Databases

4. Chapter 3: Identifying the Right Data Platform

5. Section 2 – Building Data Processing Pipelines

6. Chapter 4: ETL Data Load – A Batch-Based Solution to Ingesting Data in a Data Warehouse

7. Chapter 5: Architecting a Batch Processing Pipeline

8. Chapter 6: Architecting a Real-Time Processing Pipeline

9. Chapter 7: Core Architectural Design Patterns

10. Chapter 8: Enabling Data Security and Governance

11. Section 3 – Enabling Data as a Service

12. Chapter 9: Exposing MongoDB Data as a Service

13. Chapter 10: Federated and Scalable DaaS with GraphQL

14. Section 4 – Choosing Suitable Data Architecture

15. Chapter 11: Measuring Performance and Benchmarking Your Applications

16. Chapter 12: Evaluating, Recommending, and Presenting Your Solutions

17. Index

18. Other Books You May Enjoy

Practical data governance using DataHub and NiFi

In this section, we will discuss a tool called DataHub and how data stakeholders and stewards can use it to enable better data governance. But first, let's understand the use case and what we are trying to achieve.

We will build a data governance capability around a data ingestion pipeline. This pipeline fetches any new objects from an S3 location, enriches them, and stores the data in a MySQL table. In this particular use case, telephone recharge (top-up) events arrive in an S3 bucket from various sources, such as mobile or the web. An Apache NiFi pipeline enriches this data and stores it in a MySQL database.
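To make the enrich step of this use case more concrete, here is a minimal plain-Java sketch. It is an illustration only: the pipeline in this chapter is built in Apache NiFi rather than in hand-written Java, and the event layout (`phone_number,amount,channel`), the `channel_name` column, and the class and method names are all assumptions made for this example, not details taken from the actual pipeline.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Illustrative sketch only; the chapter's pipeline is built in Apache NiFi. */
public class RechargeEnrichmentSketch {

    // Hypothetical lookup from a source channel code to a display name.
    private static final Map<String, String> CHANNEL_NAMES = new LinkedHashMap<>();
    static {
        CHANNEL_NAMES.put("mobile", "Mobile app");
        CHANNEL_NAMES.put("web", "Web portal");
    }

    /**
     * Enriches one raw event given as "phone_number,amount,channel" (an assumed
     * layout) and returns the column-to-value map that a later step could
     * write to a MySQL table.
     */
    public static Map<String, String> enrich(String rawLine) {
        String[] parts = rawLine.split(",");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Expected 3 fields: " + rawLine);
        }
        // Normalize the channel so that "Web" and "web" map to the same entry.
        String channel = parts[2].trim().toLowerCase(Locale.ROOT);

        Map<String, String> row = new LinkedHashMap<>();
        row.put("phone_number", parts[0].trim());
        row.put("amount", parts[1].trim());
        row.put("channel", channel);
        // The enrichment itself: add a descriptive column derived from the raw one.
        row.put("channel_name", CHANNEL_NAMES.getOrDefault(channel, "Unknown"));
        return row;
    }

    public static void main(String[] args) {
        // Prints {phone_number=15550100, amount=20.00, channel=web, channel_name=Web portal}
        System.out.println(enrich("15550100,20.00,Web"));
    }
}
```

Reading from S3 and writing to MySQL are deliberately left out of the sketch, so that it stays self-contained and focuses on what "enriching" an event could mean here.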

Apache NiFi is a powerful and reliable drag-and-drop visual tool that allows you to easily process and distribute data. It models a workflow or a data pipeline as a directed graph. It consists of the following high-level components so that...

