Data pipelines with Amazon Kinesis Data Streams
Now that we have learned how to create streams, producers, and consumers, we will design a simple data pipeline for SmartCity. A data pipeline is a series of processing steps applied to data as it flows from a source to a target destination. These steps can include automation for copying, transforming, routing, and loading source data into destinations such as business systems, data lakes, and data warehouses. A data pipeline should meet the requirements for data throughput, reliability, and latency. A well-architected design prevents many of the common problems that occur when collecting and loading data, such as data corruption, bottlenecks, conflicts between sources, and the creation of duplicate entries.
Data pipeline design (simple)
This first design demonstrates receiving data from a single source. The producer is the Amazon Kinesis Agent deployed in the SmartCity data center...
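To make this concrete, the Kinesis Agent is configured through its agent.json file (typically at /etc/aws-kinesis/agent.json), which maps local log files to a destination stream. The following is a minimal sketch; the file path and stream name (smartcity-data-stream) are assumptions for illustration, not values from the text:

```json
{
  "cloudwatch.emitMetrics": true,
  "flows": [
    {
      "filePattern": "/var/log/smartcity/*.log",
      "kinesisStream": "smartcity-data-stream",
      "partitionKeyOption": "RANDOM"
    }
  ]
}
```

Each entry in "flows" tails the files matching "filePattern" and sends new lines as records to the named stream; "partitionKeyOption": "RANDOM" spreads records evenly across shards, which suits a single-source pipeline where no per-key ordering is required.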