Spark Streaming is an extension of the core Spark API that provides fault-tolerant, high-throughput processing of real-time data. Its APIs allow scalable processing of data streams as they are generated at a source. Examples of such sources include:
- Click-stream data of websites
- Application logs
- Data coming over a TCP port
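
To make the last source concrete, here is a minimal sketch of a word count over lines arriving on a TCP port. It assumes a PySpark installation that still ships the DStream API (`pyspark.streaming`). The host, port, batch interval, and the names `split_words` and `run_word_count` are illustrative choices, not something prescribed by the text above.

```python
def split_words(line):
    """Split one line of text into words, dropping empty tokens."""
    return line.split()


def run_word_count(host="localhost", port=9999, batch_seconds=1):
    """Count words in text received on a TCP socket, one batch at a time.

    host, port and batch_seconds are assumed values for a local experiment.
    """
    # Imported here so split_words stays usable without PySpark installed.
    from pyspark import SparkContext
    from pyspark.streaming import StreamingContext

    # local[2]: one thread for the socket receiver, one for processing.
    sc = SparkContext("local[2]", "SocketWordCount")
    ssc = StreamingContext(sc, batch_seconds)

    # Each record in the stream is one line of text read from the socket.
    lines = ssc.socketTextStream(host, port)
    counts = (
        lines.flatMap(split_words)
        .map(lambda word: (word, 1))
        .reduceByKey(lambda a, b: a + b)
    )
    counts.pprint()  # print the first few counts of every batch

    ssc.start()             # start receiving and processing
    ssc.awaitTermination()  # block until the job is stopped
```

To try it locally, one option is to start a text server with `nc -lk 9999` in another terminal, call `run_word_count()`, and type a few lines into the `nc` session. The counts are computed per batch, not cumulatively.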