Building Big Data Pipelines with Apache Beam: Use a single programming model for both batch and stream data processing

Product type: Paperback
Published: Jan 2022
Publisher: Packt
ISBN-13: 9781800564930
Length: 342 pages
Edition: 1st Edition
Languages: Python
Tools: Apache Beam
Concepts: Big Data
Author: Jan Lukavský


Table of Contents

Preface

Section 1 Apache Beam: Essentials

Chapter 1: Introduction to Data Processing with Apache Beam

Chapter 2: Implementing, Testing, and Deploying Basic Pipelines

Chapter 3: Implementing Pipelines Using Stateful Processing

Section 2 Apache Beam: Toward Improving Usability

Chapter 4: Structuring Code for Reusability

Chapter 5: Using SQL for Pipeline Implementation

Chapter 6: Using Your Preferred Language with Portability

Section 3 Apache Beam: Advanced Concepts

Chapter 7: Extending Apache Beam's I/O Connectors

Chapter 8: Understanding How Runners Execute Pipelines

Other Books You May Enjoy

Setting up the environment for this book

In this section, we will set up the environment needed for this chapter and the rest of the book. The technologies we will build upon are Docker and minikube.

minikube is a tool that runs a local Kubernetes cluster on a single machine, which will enable us to easily set up the other technologies we need.

Let's set up everything we need for this chapter now:

1. Install minikube by following the steps at https://minikube.sigs.k8s.io/docs/start/.
2. Install the kubectl tool using the official Kubernetes instructions, which can be found at https://kubernetes.io/docs/tasks/tools/.
3. After installing minikube, start it by executing the following command:
```
$ minikube start
```
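If the start command fails because a tool cannot be found, a quick check of the prerequisites can narrow down what is missing. The following is a minimal sketch, not part of the book's setup; the helper name `check_tool` is hypothetical, and it only tests whether each tool is on the `PATH`, not whether it is correctly configured.

```shell
# Report whether a command-line tool is available on PATH.
# check_tool is a hypothetical helper name used for illustration.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: missing"
  fi
}

# The three tools this section relies on.
for tool in docker minikube kubectl; do
  check_tool "$tool"
done
```

Any tool reported as missing should be installed using the instructions linked above before continuing.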
Important note
The minikube start command accepts the optional --cpus and --memory arguments, which set the number of CPUs and the amount of memory available to minikube. We recommend using all of the CPUs available...
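As a sketch of how these two arguments might be passed, the snippet below builds a start command from two variables. The values (4 CPUs, 8192 MB of memory) and the variable names are illustrative assumptions, not the book's recommendation; pick values that fit your machine. A plain number given to --memory is interpreted as megabytes.

```shell
# Illustrative values only (assumptions): adjust to what your machine can spare.
CPUS=4          # number of CPUs to give minikube
MEMORY_MB=8192  # amount of memory, in megabytes

# Build the command first so it can be inspected before running it.
START_CMD="minikube start --cpus=${CPUS} --memory=${MEMORY_MB}"
echo "$START_CMD"

# Uncomment the next line to actually start minikube with these settings:
# $START_CMD
```

Printing the command before running it makes it easy to confirm the settings, since changing them later typically means recreating the minikube cluster.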
