Although we produced the data in JSON format, Spark reads each message value in binary format. To convert the binary value to a string, we use the following code:
Dataset<Row> healthCheckJsonDf =
    inputDataset.selectExpr("CAST(value AS STRING)");
The Dataset console output is now human-readable, and is shown as follows:
+--------------------------+
|                     value|
+--------------------------+
| {"event":"HEALTH_CHECK...|
+--------------------------+
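To see what the cast does conceptually, we can reproduce it in plain Java without Spark. The following is only a minimal sketch: the class name BinaryValueDemo, the decode helper, and the sample payload are hypothetical, and it assumes that the message bytes are UTF-8 encoded text, which is how Spark interprets a binary column when it is cast to a string:

```java
import java.nio.charset.StandardCharsets;

public class BinaryValueDemo {

    // Decodes a raw message value as UTF-8 text, mirroring what
    // CAST(value AS STRING) is assumed to do for a binary column.
    public static String decode(byte[] value) {
        return new String(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Hypothetical payload, made up for illustration only.
        String json = "{\"event\":\"HEALTH_CHECK\",\"factory\":\"F1\"}";

        // This is the form in which the value travels: an array of bytes.
        byte[] raw = json.getBytes(StandardCharsets.UTF_8);
        System.out.println("Raw value: " + raw.length + " bytes");

        // After decoding, the value is human-readable JSON text again.
        System.out.println("Decoded value: " + decode(raw));
    }
}
```

Note that the decoded value is still a single string. Nothing has been parsed yet, so the individual JSON fields are not available as columns at this point.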
The next step is to define the schema, that is, the list of fields that describes the structure of the JSON message, as follows:
StructType struct = new StructType()
.add("event", DataTypes.StringType)
.add("factory", DataTypes.StringType)
.add("serialNumber", DataTypes.StringType)
.add("type", DataTypes.StringType)
.add("status...