In this chapter, we will look at the components that go into designing real-time streaming pipelines. We start by comparing stream and batch processing, then cover the key streaming challenges and introduce the key streaming concepts. Traditional analytics solutions are designed around batch operations that move data between different persisted data stores. Users issue queries against the persisted data at rest to perform ad hoc analysis or to populate dashboards and scorecards. This approach has been in use for many years and remains a relevant solution for business operations today.
Extracting insights from a streaming data set requires a different approach and technology paradigm. We will focus on those aspects in the next sections. Let's begin with a quick comparison between stream and batch processing...