Basic stream processing and computational techniques
We will now describe some basic computations that can be performed on a stream of data. When summary operations such as aggregations or histograms must run under limits on memory and processing time, some trade-off in accuracy is unavoidable. Two well-known types of approximation in these situations are:
- ϵ Approximation: The computed result is within a fraction ϵ of the exact value, that is, its relative error is at most ϵ.
- (ϵ, δ) Approximation: The computed result is within a factor of 1 ± ϵ of the exact value with probability at least 1 – δ.
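One common way to write the two guarantees above more formally is sketched here. The symbols are assumptions introduced for illustration and do not come from the text: X is the exact value and X̂ is the value computed from the stream.

```latex
% epsilon approximation: the relative error is bounded deterministically
\[
  |\hat{X} - X| \le \epsilon \, X
\]

% (epsilon, delta) approximation: the same bound holds with high probability
\[
  \Pr\bigl[\, |\hat{X} - X| \le \epsilon \, X \,\bigr] \ge 1 - \delta
\]
```

For example, with ϵ = 0.05 and δ = 0.01, the estimate would be within 5% of the exact value in at least 99% of runs.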
Stream computations
We will illustrate some basic computations and aggregations to highlight the difference between batch and stream-based calculations, where each operation must work within memory constraints and yet account for the entire data:
- Frequency count or point queries: The generic technique of Count-Min Sketch has been successfully applied to perform various summarizations...
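To make the idea of a point query concrete, the following is a minimal sketch of a Count-Min Sketch in Python. The class name, the method names, and the use of salted BLAKE2 digests as row hash functions are assumptions made for illustration. The sizing rule (width ⌈e/ϵ⌉, depth ⌈ln(1/δ)⌉) is the one usually quoted for this structure, and the formal guarantee relies on pairwise-independent hash functions, which the salted digest only approximates in practice.

```python
import hashlib
import math


class CountMinSketch:
    """Approximate frequency counter with fixed memory (illustrative only)."""

    def __init__(self, epsilon=0.01, delta=0.01):
        # Usual sizing: width = ceil(e / epsilon), depth = ceil(ln(1 / delta)).
        self.width = math.ceil(math.e / epsilon)
        self.depth = math.ceil(math.log(1.0 / delta))
        self.table = [[0] * self.width for _ in range(self.depth)]
        self.total = 0  # number of events seen so far

    def _index(self, item, row):
        # Assumption: one hash function per row, obtained by salting the
        # item with the row number before hashing.
        data = f"{row}:{item}".encode("utf-8")
        digest = hashlib.blake2b(data, digest_size=8).digest()
        return int.from_bytes(digest, "big") % self.width

    def add(self, item, count=1):
        # Increment one counter in every row.
        self.total += count
        for row in range(self.depth):
            self.table[row][self._index(item, row)] += count

    def estimate(self, item):
        # Collisions can only inflate a counter, so the minimum across rows
        # is the tightest available estimate and never undercounts.
        return min(
            self.table[row][self._index(item, row)]
            for row in range(self.depth)
        )


sketch = CountMinSketch(epsilon=0.001, delta=0.01)
for word in ["a", "b", "a", "c", "a"]:
    sketch.add(word)
print(sketch.estimate("a"))  # never below the true count of 3
```

The memory used depends only on ϵ and δ, not on the number of distinct items in the stream, which is what makes the structure suitable for the constrained setting described above. The price is a one-sided error: an estimate may exceed the true count, by roughly ϵ times the total number of events with probability about 1 – δ under the hashing assumption.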