How to scale queries using Streaming units and partitions
In traditional static data scenarios, a query is executed against a fixed data set and its results are available after a known interval. In a streaming data scenario, by contrast, the dataset changes constantly, so queries run for longer or might never complete.
Additionally, a constant stream of data keeps increasing the data volume, and the query will eventually drain the working memory. One way to draw a data boundary is through the context of time. For example, with a streaming dataset we can specify a boundary defined by a start time and an end time, which restricts query execution to a known range of data. Application time and arrival time are the two types of timing constraints we can use to set time boundaries for streaming data.
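The idea of a time boundary can be illustrated with a minimal sketch in plain Python. This is not how a streaming engine implements it; the `Event` class, its field names, and the `within_boundary` helper are hypothetical names introduced here only for illustration. The sketch assumes each event carries both timestamps and that the boundary is half-open, so the start time is included and the end time is excluded.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List


@dataclass
class Event:
    """Hypothetical event record carrying both timestamps."""
    application_time: datetime  # when the event was produced at its origin
    arrival_time: datetime      # when the event landed at the ingestion point
    value: float


def within_boundary(
    events: Iterable[Event],
    start: datetime,
    end: datetime,
    use_arrival_time: bool = False,
) -> List[Event]:
    """Return the events whose chosen timestamp falls in [start, end).

    By default the application time is compared; pass
    use_arrival_time=True to bound by arrival time instead.
    """
    selected = []
    for event in events:
        ts = event.arrival_time if use_arrival_time else event.application_time
        if start <= ts < end:
            selected.append(event)
    return selected
```

Because the query only ever looks at events inside the boundary, the amount of data it has to hold stays bounded even though the stream itself never ends. Note that the same event can fall inside the boundary by application time and outside it by arrival time, so the choice of timestamp changes the result.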
Application and Arrival Time
The time at which an event originates is known as the Application Time; the time at which the event lands is called the Arrival Time. Within the queries, we can use...