MapReduce as a pattern and programming model has been around for many years, arising from parallel computing research and industry implementations. Most famously, MapReduce hit the mainstream with Google's 2004 paper entitled MapReduce—Simplified Data Processing on Large Clusters (https://research.google.com/archive/mapreduce.html). Much of the benefit of Google's initial MapReduce implementation was:
- Automatic parallelization and distribution
- Fault-tolerance
- I/O scheduling
- Status and monitoring
If you take a step back and look at that list, it should look familiar. FaaS systems such as AWS Lambda give us most of these benefits. While status and monitoring aren't inherently baked into FaaS platforms, there are ways to ensure our functions are executing successfully. On that same topic, MapReduce systems were initially...