In this recipe, we explore queueStream(), a valuable tool for getting a streaming program to work during the development cycle. We found the queueStream() API very useful and felt that other developers could benefit from a recipe that fully demonstrates its usage.
We start by simulating a user browsing various URLs associated with different web pages using the ClickGenerator.scala program, and then consume and tabulate the data (user behavior/visits) using the ClickStream.scala program.
We use Spark's streaming API with DStream, which requires a streaming context (StreamingContext). We call this out explicitly to highlight one of the differences between the Spark Streaming and Spark Structured Streaming programming models.
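To make the idea concrete, here is a minimal sketch of how queueStream() can feed simulated clicks into a DStream. It is not the recipe's ClickGenerator.scala or ClickStream.scala: the object name QueueStreamSketch, the helper urlOf, the "userId,url" record layout, the sample URLs, the local[2] master, and the one-second batch interval are all assumptions made for illustration.

```scala
import scala.collection.mutable

import org.apache.spark.SparkConf
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Hypothetical sketch; names and data are illustrative, not from the recipe.
object QueueStreamSketch {

  // Assumed record layout for illustration: "userId,url".
  def urlOf(click: String): String = click.split(",")(1)

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("QueueStreamSketch")

    // DStreams require a StreamingContext; the 1-second batch interval is arbitrary.
    val ssc = new StreamingContext(conf, Seconds(1))

    // With the default settings, each RDD pushed onto this queue
    // is consumed as one micro-batch.
    val clickQueue = new mutable.Queue[RDD[String]]()
    val clicks = ssc.queueStream(clickQueue)

    // Tabulate visits per URL within each batch and print the counts.
    clicks.map(click => (urlOf(click), 1)).reduceByKey(_ + _).print()

    ssc.start()

    // Simulate a few batches of user clicks (made-up data).
    val urls = Seq("/home", "/products", "/cart")
    for (_ <- 1 to 5) {
      val simulated = (1 to 20).map(i => s"user${i % 4},${urls(i % urls.size)}")
      clickQueue.synchronized {
        clickQueue += ssc.sparkContext.makeRDD(simulated)
      }
      Thread.sleep(1000)
    }

    ssc.stop(stopSparkContext = true, stopGracefully = true)
  }
}
```

Because the input is just a queue of RDDs, the stream can be driven from the same program without a socket, Kafka topic, or file source, which is what makes this approach convenient while developing and debugging.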