Spark on Mesos
Apache Spark is a powerful open-source processing engine built around speed, ease of use, and sophisticated analytics. It is currently one of the fastest-growing big data technologies and is used in production by several leading companies.
Interestingly, Apache Spark began as a research project in 2009 at the AMPLab, UC Berkeley, to prove that a distributed processing framework leveraging memory resources could run atop Apache Mesos. It was open sourced in 2010, entered the Apache Incubator in 2013, and became an Apache top-level project in 2014. In its short existence, Apache Spark has captured the attention of the developer community and is slowly finding its way into the lexicon of business decision makers as well. This, along with the fact that it is now in production in over 5,000 organizations, speaks volumes about its versatility and utility.
Why Spark
With earlier distributed parallel computation frameworks such as MapReduce, each computation step had...