The first example - a numerical summarization application
One of the most common needs when you have a large data set is to process its elements to measure certain characteristics. For example, if you have a data set with the products purchased in a shop, you can count the number of products you have sold, the number of units sold per product, or the average amount that each customer spent. We call this process numerical summarization.
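The three measures mentioned above can be sketched with streams as follows. This is only an illustration under stated assumptions: the `Purchase` class, its fields, and the sample values are hypothetical and are not taken from any real data set.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class NumericalSummary {

    // Hypothetical line item: who bought what, how many units, at which price.
    static final class Purchase {
        final String customer;
        final String product;
        final int quantity;
        final double unitPrice;

        Purchase(String customer, String product, int quantity, double unitPrice) {
            this.customer = customer;
            this.product = product;
            this.quantity = quantity;
            this.unitPrice = unitPrice;
        }

        String getCustomer() { return customer; }
        String getProduct() { return product; }
        int getQuantity() { return quantity; }
        double getAmount() { return quantity * unitPrice; }
    }

    // Made-up sample data, used only to exercise the methods below.
    static List<Purchase> sample() {
        return Arrays.asList(
                new Purchase("alice", "mug", 2, 4.0),
                new Purchase("bob", "mug", 1, 4.0),
                new Purchase("alice", "lamp", 1, 20.0),
                new Purchase("carol", "tea", 2, 2.0));
    }

    // Total number of units sold across all purchases.
    static int totalUnits(List<Purchase> purchases) {
        return purchases.parallelStream()
                .mapToInt(Purchase::getQuantity)
                .sum();
    }

    // Number of units sold for each product.
    static Map<String, Integer> unitsPerProduct(List<Purchase> purchases) {
        return purchases.parallelStream()
                .collect(Collectors.groupingBy(
                        Purchase::getProduct,
                        Collectors.summingInt(Purchase::getQuantity)));
    }

    // Average of the total amount spent by each customer.
    static double averageSpentPerCustomer(List<Purchase> purchases) {
        Map<String, Double> totalPerCustomer = purchases.parallelStream()
                .collect(Collectors.groupingBy(
                        Purchase::getCustomer,
                        Collectors.summingDouble(Purchase::getAmount)));
        return totalPerCustomer.values().stream()
                .mapToDouble(Double::doubleValue)
                .average()
                .orElse(0.0); // no customers: report 0 instead of failing
    }

    public static void main(String[] args) {
        List<Purchase> purchases = sample();
        System.out.println("Total units: " + totalUnits(purchases));
        System.out.println("Units per product: " + unitsPerProduct(purchases));
        System.out.println("Average per customer: " + averageSpentPerCustomer(purchases));
    }
}
```

Each measure is a single stream pipeline: a reduction (`sum`), a grouping with a downstream collector, or a grouping followed by an average over the per-group totals.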
In this chapter, we are going to use streams to obtain some measures of the Online Retail dataset from the UCI Machine Learning Repository, which you can download from http://archive.ics.uci.edu/ml/datasets/Online+Retail. This dataset stores all the transactions that occurred between 01/12/2010 and 09/12/2011 for a UK-based and registered non-store online retailer.
Unlike in other chapters, in this case we first explain the concurrent version using streams and then show how to implement an equivalent serial version, to verify that concurrency improves performance with streams...
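As a rough illustration of that kind of comparison, the following sketch runs the same pipeline serially and in parallel and checks that both produce the same result. The class name, the `sumOfSquares` method, and the synthetic workload are assumptions made for this example and have nothing to do with the dataset. The timing is deliberately naive (a single run with no warm-up), so treat the printed numbers as indicative at best; a benchmark harness such as JMH gives more reliable measurements.

```java
import java.util.stream.LongStream;

public class SerialVsParallel {

    // Sums the squares of 1..n, on a sequential or a parallel stream.
    static long sumOfSquares(long n, boolean parallel) {
        LongStream values = LongStream.rangeClosed(1, n);
        if (parallel) {
            values = values.parallel();
        }
        return values.map(v -> v * v).sum();
    }

    public static void main(String[] args) {
        long n = 1_000_000L; // small enough that the sum fits in a long

        long start = System.nanoTime();
        long serial = sumOfSquares(n, false);
        long serialNanos = System.nanoTime() - start;

        start = System.nanoTime();
        long parallel = sumOfSquares(n, true);
        long parallelNanos = System.nanoTime() - start;

        // Both versions must agree before the timings mean anything.
        if (serial != parallel) {
            throw new IllegalStateException("Serial and parallel results differ");
        }
        System.out.println("Serial:   " + serialNanos / 1_000_000.0 + " ms");
        System.out.println("Parallel: " + parallelNanos / 1_000_000.0 + " ms");
    }
}
```

The only difference between the two versions is the call to `parallel()`, which is what makes a serial equivalent easy to write and compare against.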