What is Spark Streaming?
Spark Streaming was introduced in Spark 0.7 in early 2013 with the objective of providing a fault-tolerant, scalable architecture that delivers second-scale latency, offers a simple programming model, and integrates with batch and interactive processing. At the time, the industry had given in to the idea of running separate platforms for batch and streaming workloads, with Storm and Trident being the popular open source streaming engines of choice. Storm provided at-least-once semantics, while Trident provided exactly-once semantics. Spark Streaming revolutionized streaming by letting users run streaming and batch workloads within the same framework and by emphasizing that users should not have to worry about maintaining the state of objects themselves. It is now one of the most popular Spark APIs, and according to a recent Spark survey carried out by Databricks, more than 50% of users consider Spark Streaming the most important component...