The first example – a numerical summarization application
One of the most common needs when you have a big set of data is to process its elements to measure certain characteristics. For example, if you have a set with the products purchased in a shop, you can count the number of products you have sold, the number of units sold per product, or the average amount that each customer spent. We call this process numerical summarization.
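To make the idea concrete, the following minimal sketch computes the three measures mentioned above with the Java 8 Stream API. The Purchase class, its fields, and the sample values are hypothetical and exist only for illustration; they are not part of the dataset used in this chapter. The average is taken over the purchases of each customer, which is one possible reading of "average amount".

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class PurchaseSummary {

    // Hypothetical purchase line: who bought what, how many units, and for how much.
    static final class Purchase {
        private final String customer;
        private final String product;
        private final int units;
        private final double amount;

        Purchase(String customer, String product, int units, double amount) {
            this.customer = customer;
            this.product = product;
            this.units = units;
            this.amount = amount;
        }

        String getCustomer() { return customer; }
        String getProduct() { return product; }
        int getUnits() { return units; }
        double getAmount() { return amount; }
    }

    // Total number of units sold across all purchases.
    static int totalUnits(List<Purchase> purchases) {
        return purchases.stream().mapToInt(Purchase::getUnits).sum();
    }

    // Number of units sold, grouped by product name.
    static Map<String, Integer> unitsPerProduct(List<Purchase> purchases) {
        return purchases.stream().collect(
                Collectors.groupingBy(Purchase::getProduct,
                        Collectors.summingInt(Purchase::getUnits)));
    }

    // Average amount per purchase, grouped by customer.
    static Map<String, Double> averageAmountPerCustomer(List<Purchase> purchases) {
        return purchases.stream().collect(
                Collectors.groupingBy(Purchase::getCustomer,
                        Collectors.averagingDouble(Purchase::getAmount)));
    }

    // Made-up sample data used only to exercise the methods above.
    static List<Purchase> sample() {
        return Arrays.asList(
                new Purchase("ana", "pen", 2, 3.0),
                new Purchase("ana", "book", 1, 12.0),
                new Purchase("luis", "pen", 5, 10.0));
    }

    public static void main(String[] args) {
        List<Purchase> purchases = sample();
        System.out.println("Total units: " + totalUnits(purchases));
        System.out.println("Units per product: " + unitsPerProduct(purchases));
        System.out.println("Average per customer: " + averageAmountPerCustomer(purchases));
    }
}
```

Each method is a single stream pipeline, so the same code can be pointed at a much larger list without changes to its structure.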
In this chapter, we are going to use streams to obtain some measures of the Bank Marketing dataset of the UCI Machine Learning Repository that you can download from http://archive.ics.uci.edu/ml/datasets/Bank+Marketing. Specifically, we have used the bank-additional-full.csv file. This dataset stores information about marketing campaigns of a Portuguese banking institution.
Unlike in other chapters, here we first explain the concurrent version using streams and then show how to implement an equivalent serial version to verify that concurrency improves...