Designing a stream processing solution
Stream processing systems or real-time processing systems are systems that perform data processing in near real time. Think of stock market updates, real-time traffic updates, real-time credit card fraud detection, and more. Incoming data is processed as and when it arrives with very minimal latency, usually in the range of milliseconds to seconds. In Chapter 2, Designing a Data Storage Structure, we learned about the Data Lake architecture, where we saw two branches of processing: one for streaming and one for batch processing. In the previous chapter, Chapter 9, Designing and Developing a Batch Processing Solution, we focused on the batch processing pipeline. In this chapter, we will focus on stream processing. The blue boxes in the following diagram show the streaming pipeline:
Stream processing systems consist of four major components:
- An Event Ingestion...