Setting up the project
Before writing any code, let's review the requirements for the stream processing application. Recall that BTC price events originate in the customer's web browser and are dispatched to Kafka through an HTTP event collector. Because these events are created in an environment outside Doubloon's control, the first step is to validate that each input event has the correct structure. Defective events would otherwise propagate bad data downstream (most data scientists agree that a great deal of time would be saved if input data arrived clean).
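As a rough illustration of what such a structural check might look like, here is a minimal sketch in Java using the Jackson library. The field names (`event`, `customer`, `currency`, `timestamp`) are placeholders, not the actual Doubloon schema; substitute the real required fields:

```java
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

public final class EventValidator {
    private static final ObjectMapper MAPPER = new ObjectMapper();

    // Hypothetical required fields for a BTC price event; adjust to the real schema.
    private static final String[] REQUIRED_FIELDS = {"event", "customer", "currency", "timestamp"};

    /** Returns true only if the message parses as JSON and carries every required field. */
    public static boolean isValid(String message) {
        try {
            JsonNode node = MAPPER.readTree(message);
            for (String field : REQUIRED_FIELDS) {
                if (!node.hasNonNull(field)) {
                    return false; // well-formed JSON, but structurally incomplete
                }
            }
            return true;
        } catch (Exception e) {
            return false; // not even well-formed JSON
        }
    }
}
```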
Getting ready
Putting it all together, the specification is to create a stream processing application that does the following (a minimal sketch follows the list):
- Reads individual events from a Kafka topic called raw-messages
- Validates each event
- Writes the valid events to a Kafka topic called valid-messages, and sends any invalid or corrupted message to a dedicated Kafka topic called invalid-messages
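One way to wire these steps together is a plain consumer/producer loop. The following sketch assumes a local broker at `localhost:9092`, String-serialized JSON events, a group id of `validator`, and the hypothetical `EventValidator` helper sketched earlier; none of these are fixed by the specification:

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class ValidatorApp {
    public static void main(String[] args) {
        Properties consumerProps = new Properties();
        consumerProps.put("bootstrap.servers", "localhost:9092");
        consumerProps.put("group.id", "validator");
        consumerProps.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        consumerProps.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        Properties producerProps = new Properties();
        producerProps.put("bootstrap.servers", "localhost:9092");
        producerProps.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        producerProps.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consumerProps);
             KafkaProducer<String, String> producer = new KafkaProducer<>(producerProps)) {
            consumer.subscribe(Collections.singletonList("raw-messages"));
            while (true) {
                for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofMillis(100))) {
                    // Route each event by the structural check: valid events go to
                    // valid-messages, everything else to invalid-messages.
                    String target = EventValidator.isValid(record.value())
                            ? "valid-messages" : "invalid-messages";
                    producer.send(new ProducerRecord<>(target, record.key(), record.value()));
                }
            }
        }
    }
}
```

Routing invalid messages to their own topic, rather than dropping them, keeps the bad input available for later inspection and replay.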
All this is detailed in the following diagram, the first...