The data scientist and Spark features
One of the interesting questions relevant to this book is, "What do data scientists want?" The question is discussed and debated in many blogs. A short answer is as follows:
The ability to explore, model, and reason about data at scale, because many of their algorithms get asymptotically better with more data, so a small dataset sample is not enough for exploring different algorithms
The ability to deploy without a lot of impedance
The facility to evolve models once they are in production and the real world is using them
In short, all we ask for is the shortest path from the lab to the factory, which enables the data scientist DevOps person! The following screenshot, which combines talks from Josh Willis and Ian Buss and shows The Sense & Sensibility of a Data Scientist DevOps, succinctly conveys the value of Apache Spark to a data scientist by addressing three points:
Who is this data scientist DevOps person?
Of course, we really do not want to start...