Building a real-time pipeline
For the real-time pipeline, we will use the same data simulation code we used in Chapter 8 with an enhanced architecture. In Figure 10.9, you can find an architecture design of the pipeline we are about to build:
Figure 10.9 – Real-time data pipeline architecture
First, we need to create a virtual private cloud (VPC) – a private network – on AWS and set up a Relational Database Service (RDS) Postgres database that will work as our data source:
- Go to the AWS console and navigate to the VPC page. On the VPC page, click on Create VPC, and you will get to the configuration page.
- Make sure VPC and more is selected. Type bdok in the Name tag auto-generation block and check the Auto-generate box so that AWS will generate all resources’ names according to the project name. For IPv4 CIDR block, let’s use the 10.20.0.0/16 Classless Inter-Domain Routing (CIDR) block. Leave the rest...
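To get a feel for what the 10.20.0.0/16 block provides, the short sketch below inspects it with Python's standard ipaddress module. It is only an illustration and not part of the console steps; the /20 subnet size is an assumption chosen for the example, not a value taken from the configuration above.

```python
import ipaddress

# The CIDR block chosen for the VPC in the step above.
vpc = ipaddress.ip_network("10.20.0.0/16")

# A /16 leaves 16 host bits, so the block spans 2**16 addresses.
print(vpc.num_addresses)

# The block falls inside the private 10.0.0.0/8 range.
print(vpc.is_private)

# Assumption for illustration: carve the VPC into /20 subnets.
subnets = list(vpc.subnets(new_prefix=20))
print(len(subnets))
for subnet in subnets[:4]:
    print(subnet)
```

Any subnet created inside the VPC has to fall within this range, so a quick check like this can help when planning how many subnets of a given size will fit.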