Introducing streaming data applications
Traditional batch applications typically ran for hours, processing all or most of the data stored in relational databases. More recently, Hadoop-based systems have been used to support MapReduce-style batch jobs that process very large volumes of distributed data. In contrast, stream processing operates on streaming data that is continuously generated. Such processing is used in a wide variety of analytics applications that compute correlations between events, aggregate values, sample incoming data, and so on.
Stream processing typically ingests a sequence of data and incrementally computes statistics and other functions on the fly, either record by record (event by event) or over sliding time windows.
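To make this concrete, the following Python sketch shows the simplest form of record-by-record, sliding-window computation: each incoming event updates a running aggregate in place, and events that fall outside the window are evicted, so the statistic stays current without reprocessing the whole stream. The class name, window size, and event shape are illustrative assumptions, not tied to any particular streaming framework.

```python
import time
from collections import deque

class SlidingWindowAverage:
    """Maintains a running average over the last `window_seconds` of events."""

    def __init__(self, window_seconds=60):
        self.window_seconds = window_seconds
        self.events = deque()       # (timestamp, value) pairs currently in the window
        self.running_sum = 0.0      # maintained incrementally, never recomputed from scratch

    def add(self, timestamp, value):
        # Ingest one event and update the aggregate incrementally.
        self.events.append((timestamp, value))
        self.running_sum += value
        self._evict(timestamp)

    def _evict(self, now):
        # Drop events that have slid out of the time window.
        while self.events and now - self.events[0][0] > self.window_seconds:
            _, old_value = self.events.popleft()
            self.running_sum -= old_value

    def average(self):
        return self.running_sum / len(self.events) if self.events else 0.0

# Example: process a continuous stream of (timestamp, value) readings.
window = SlidingWindowAverage(window_seconds=60)
for ts, reading in [(time.time() + i, float(i)) for i in range(5)]:
    window.add(ts, reading)
    print(f"rolling average: {window.average():.2f}")
```

Production stream processors apply the same idea at scale, distributing the windowed state across many nodes and handling out-of-order or late-arriving events.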
Increasingly, streaming data applications apply machine learning algorithms and Complex Event Processing (CEP) algorithms to provide strategic insights and the ability to react quickly and intelligently to rapidly changing business conditions. Such applications...