MapReduce is a programming model used to process very large datasets in distributed environments, using many machines working together in a cluster. This programming model has the following two operations (illustrated with a short sketch after the list):
- Map: This operation filters and transforms the original elements into a form more suitable for the reduce operation
- Reduce: This operation generates a summary result from all the elements, for example, the sum or the average of numeric values
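For example, both operations can be expressed directly with the Java Stream API. The following minimal sketch (the word list and the length computation are illustrative assumptions, not taken from the text) maps each word to its length and then reduces those lengths to their sum:

```java
import java.util.List;

public class MapReduceSketch {

    public static void main(String[] args) {
        // Illustrative input data (an assumption for this sketch)
        List<String> words = List.of("map", "reduce", "stream", "cluster");

        int totalLength = words.stream()
                // Map: transform each element into a form suitable for the reduction
                .map(String::length)
                // Reduce: summarize all the transformed elements into a single value
                .reduce(0, Integer::sum);

        System.out.println("Total length: " + totalLength); // Prints: Total length: 22
    }
}
```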
This programming model has been commonly used in the functional programming world. In the Java ecosystem, the Hadoop project of the Apache Software Foundation provides an implementation of this model. The Stream interface implements two different reduce operations, which are contrasted in the sketch after this list:
- The pure reduce operation, implemented in the different versions of the reduce() method, which process a stream of elements to obtain a single value
- The mutable reduction, implemented in the different versions of the collect() method, which process a stream of elements and accumulate the results in a mutable container, such as a Collection or a StringBuilder
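As a minimal sketch of the difference between the two kinds of reduction (the numeric values and operations are assumptions chosen for illustration), reduce() folds the elements into a single immutable value, while collect() accumulates them into a mutable container:

```java
import java.util.List;
import java.util.stream.Collectors;

public class ReductionKindsSketch {

    public static void main(String[] args) {
        List<Integer> numbers = List.of(1, 2, 3, 4, 5);

        // Pure reduction: reduce() combines the elements into a single value
        int sum = numbers.stream()
                .reduce(0, Integer::sum);

        // Mutable reduction: collect() accumulates the elements into a mutable
        // container (here a List built by the Collectors.toList() collector)
        List<Integer> evens = numbers.stream()
                .filter(n -> n % 2 == 0)
                .collect(Collectors.toList());

        System.out.println("Sum: " + sum);     // Prints: Sum: 15
        System.out.println("Evens: " + evens); // Prints: Evens: [2, 4]
    }
}
```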