Processing structured streaming data with Azure Databricks
Streaming data refers to a continuous flow of data from one or more sources, such as IoT devices, application logs, and more. Streaming data can be processed either record by record or in batches over a sliding window, as required. A popular example of stream processing is detecting fraudulent credit card transactions as and when they happen.
In this recipe, we'll use Azure Databricks to process customer orders as and when they happen, and then aggregate and save the orders in an Azure Synapse SQL pool.
We'll simulate streaming data by reading the orders.csv file and sending the data row by row to an Azure Event Hub. We'll then read the events from the Event Hub, process them, and store the aggregated data in an Azure Synapse SQL pool.
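The following is a minimal sketch of the producer side of this setup, using the azure-eventhub Python SDK to replay orders.csv as a stream of events. The connection string and Event Hub name are placeholders, and the per-row delay is only there to make the data arrive gradually rather than all at once:

```python
# A minimal sketch, assuming the azure-eventhub Python SDK (pip install azure-eventhub).
# The connection string and Event Hub name below are placeholders for this recipe.
import csv
import json
import time

from azure.eventhub import EventHubProducerClient, EventData

EVENT_HUB_CONNECTION_STR = "<event-hubs-namespace-connection-string>"
EVENT_HUB_NAME = "<event-hub-name>"

producer = EventHubProducerClient.from_connection_string(
    conn_str=EVENT_HUB_CONNECTION_STR,
    eventhub_name=EVENT_HUB_NAME,
)

# Read orders.csv and send each row as a JSON-encoded event to simulate a stream.
with open("orders.csv", newline="") as f, producer:
    for row in csv.DictReader(f):
        batch = producer.create_batch()
        batch.add(EventData(json.dumps(row)))
        producer.send_batch(batch)
        time.sleep(0.1)  # small delay so rows arrive as a steady trickle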
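On the Databricks side, the consuming notebook can be sketched roughly as follows. This assumes the azure-eventhubs-spark connector for reading from Event Hubs and the com.databricks.spark.sqldw connector for writing to Synapse; the order schema, table name, connection strings, and storage paths are all illustrative placeholders, not values from the recipe:

```python
# A minimal sketch for a Databricks notebook, where `spark` and `sc` are predefined.
# Assumes the azure-eventhubs-spark connector and the Databricks Synapse connector.
from pyspark.sql.functions import col, from_json, window, sum as sum_
from pyspark.sql.types import StructType, StructField, StringType, IntegerType, DoubleType

# The connector requires the Event Hubs connection string to be encrypted.
conn_str = "<event-hub-connection-string>;EntityPath=<event-hub-name>"
eh_conf = {
    "eventhubs.connectionString":
        sc._jvm.org.apache.spark.eventhubs.EventHubsUtils.encrypt(conn_str)
}

# Assumed layout of a row from orders.csv; adjust to match the actual file.
order_schema = StructType([
    StructField("OrderId", StringType()),
    StructField("Quantity", IntegerType()),
    StructField("UnitPrice", DoubleType()),
    StructField("OrderTime", StringType()),
])

# Read the events, parse the JSON body, and cast the order timestamp.
orders = (
    spark.readStream.format("eventhubs").options(**eh_conf).load()
    .select(from_json(col("body").cast("string"), order_schema).alias("o"))
    .select("o.*")
    .withColumn("OrderTime", col("OrderTime").cast("timestamp"))
)

# Aggregate order amounts over one-minute windows with a watermark for late data.
aggregated = (
    orders
    .withWatermark("OrderTime", "5 minutes")
    .groupBy(window(col("OrderTime"), "1 minute"))
    .agg(sum_(col("Quantity") * col("UnitPrice")).alias("TotalAmount"))
)

# Write the aggregates to a Synapse dedicated SQL pool via a staging folder
# in Azure Data Lake Storage.
(aggregated.writeStream
    .format("com.databricks.spark.sqldw")
    .option("url", "<synapse-jdbc-connection-string>")
    .option("tempDir", "abfss://<container>@<storage-account>.dfs.core.windows.net/tempdir")
    .option("forwardSparkAzureStorageCredentials", "true")
    .option("dbTable", "OrderAggregates")
    .option("checkpointLocation", "/tmp/order-aggregates-checkpoint")
    .outputMode("append")
    .start())
```

The exact schema, window size, and Synapse table are spelled out in the steps that follow; the sketch above is only meant to show how the pieces fit together.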
Getting ready
To get started, follow these steps:
- Log into https://portal.azure.com using your Azure credentials.
- You will need an existing Azure...