In this section, we are going to apply data streaming with Kafka and Spark to a use case involving a DL4J application. The DL4J module we are going to use is DataVec.
Let's consider the example that we presented in the Spark Streaming and Kafka section. What we want to achieve is direct Kafka streaming with Spark and then to apply DataVec transformations to the incoming data as soon as it arrives, before using it downstream.
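As a quick reminder, the direct stream in that section was created along the following lines. This is only a minimal sketch: the application name, broker address, consumer group, topic name (iris-data), and batch interval are placeholders for this example rather than values taken from the earlier section.

import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent

val conf = new SparkConf().setAppName("DataVecKafkaStreaming").setMaster("local[*]")
val ssc = new StreamingContext(conf, Seconds(5))

// Kafka consumer configuration (broker address and group ID are placeholders)
val kafkaParams = Map[String, Object](
  "bootstrap.servers" -> "localhost:9092",
  "key.deserializer" -> classOf[StringDeserializer],
  "value.deserializer" -> classOf[StringDeserializer],
  "group.id" -> "datavec-consumer-group",
  "auto.offset.reset" -> "latest",
  "enable.auto.commit" -> (false: java.lang.Boolean)
)

// Subscribe to the topic that carries the incoming records (topic name is a placeholder)
val stream = KafkaUtils.createDirectStream[String, String](
  ssc,
  PreferConsistent,
  Subscribe[String, String](Array("iris-data"), kafkaParams)
)

// Each Kafka record value is expected to be a single CSV line matching the schema defined below
val lines = stream.map(record => record.value)

With the stream in place, we can move to the DataVec side of the application.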
Let's define the input schema first. This is the schema we expect for the messages that are consumed from a Kafka topic. The schema structure is the same as for the classic Iris dataset (https://en.wikipedia.org/wiki/Iris_flower_data_set):
import org.datavec.api.transform.schema.Schema

val inputDataSchema = new Schema.Builder()
  .addColumnsDouble("Sepal length", "Sepal width", "Petal length", "Petal width")
  .addColumnInteger("Species") // integer class label; the column name "Species" is assumed here
  .build
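Once the schema is defined, each incoming micro-batch can be parsed into DataVec records and transformed as soon as it arrives. The following is only a sketch of how this can be wired together with DataVec's Spark integration: the transformation itself (dropping the two petal columns) is a placeholder chosen for illustration, and lines refers to the DStream of Kafka record values created in the sketch above.

import scala.collection.JavaConverters._

import org.datavec.api.records.reader.impl.csv.CSVRecordReader
import org.datavec.api.transform.TransformProcess
import org.datavec.spark.transform.SparkTransformExecutor
import org.datavec.spark.transform.misc.StringToWritablesFunction

// Placeholder transform process built on top of the input schema
val tp = new TransformProcess.Builder(inputDataSchema)
  .removeColumns("Petal length", "Petal width")
  .build

// Apply the transform process to every micro-batch as it arrives
lines.foreachRDD { rdd =>
  // Parse each CSV line into a list of DataVec Writables
  val parsed = rdd.toJavaRDD().map(new StringToWritablesFunction(new CSVRecordReader()))
  // Execute the transform process on the parsed records
  val transformed = SparkTransformExecutor.execute(parsed, tp)
  // For demonstration purposes, just print the transformed records
  transformed.collect().asScala.foreach(println)
}

ssc.start()
ssc.awaitTermination()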